Abstract



Quasar absorption lines provide a precise test of the assumed constancy of the fundamental
constants of physics over cosmological times and distances. We have used quasar absorption
lines to investigate potential changes in the fine-structure constant, $\alpha = e^2/(4\pi\epsilon_0\hbar c)$,
and the proton-to-electron mass ratio, $\mu = m_p/m_e$.

The many-multiplet method allows one to use optical fine-structure transitions to constrain
$\Delta\alpha/\alpha$ at better than the $10^{-5}$ level. We present a new analysis of 154 quasar absorbers
with $0.2 < z < 3.7$ in VLT/UVES spectra. From these absorbers we find $2.2\sigma$ evidence
for angular variations in $\alpha$ under a dipole+monopole model. Combined with previous
Keck/HIRES observations, we find $4.1\sigma$ evidence for angular (and therefore spatial)
variations in $\alpha$, with maximal increase of $\alpha$ occurring in the direction RA = $(17.3 \pm 1.0)$ hr,
dec. = $(-61 \pm 10)^\circ$. Under a model where the observed effect is proportional to the
lookback-time distance the significance increases to $4.2\sigma$. Importantly, dipole models
fitted to the VLT and Keck samples independently yield consistent estimates of the dipole
direction, which suggests that the effect is not caused by telescope systematics. Similarly,
dipole models fitted to $z < 1.6$ and $z > 1.6$ sub-samples also point in a consistent
direction. The observed dipole effect is stable under iterative trimming of potentially outlying
$\Delta\alpha/\alpha$ values, implying that the result is not being generated by a subset of the data. We
consider a number of systematic effects, including potential wavelength scale distortions
and evolution in the abundance of Mg isotopes, and show that they are unable to
explain the observed dipole effect. If these results are correct, they directly demonstrate the
incompleteness of the Standard Model and violation of the Einstein Equivalence Principle.

Optical spectra of $z > 2$ molecular hydrogen absorbers can probe evolution in $\mu$. We have
used spectra of the quasars Q0405-443, Q0347-383 and Q0528-250 from VLT/UVES
to investigate the absorbers at $z = 2.595$, $3.025$ and $2.811$ in these spectra respectively.
We find that $\Delta\mu/\mu = (10.1 \pm 6.6) \times 10^{-6}$, $(8.2 \pm 7.5) \times 10^{-6}$ and $(-1.4 \pm 3.9) \times 10^{-6}$
in these absorbers respectively. A second spectrum of Q0528-250 provides an additional
constraint of $\Delta\mu/\mu = (0.2 \pm 3.2_{\rm stat} \pm 1.9_{\rm sys}) \times 10^{-6}$. The weighted mean of these values
yields $\Delta\mu/\mu = (1.7 \pm 2.4) \times 10^{-6}$, the most precise constraint on evolution in $\mu$ at $z > 1$.

We also demonstrate the application of Markov Chain Monte Carlo methods to determining
$\Delta\alpha/\alpha$ from quasar spectra.
Chapter 1

Introduction 



This chapter provides an introduction to the field of varying constants. We give an 
overview of the early history of the field, as well as some interesting recent developments. 
Our primary work is a two-pronged analysis of the proton-to-electron mass ratio, $\mu$, and
the fine-structure constant, $\alpha$, using quasar spectra. In this chapter, we introduce these
constants, and review constraints on changes in these constants derived from methods
which do not utilise quasar spectra. We consider constraints derived from quasars in
chapters 3 and 4 respectively. We give more details of the structure of this work in section
1.5. Chapters 3 through 7 all involve the analysis of quasar absorption lines, and so we
describe our methodology where it is common to these chapters in chapter 2.



1.1 What are fundamental constants?



Any formulation of physics is inextricably linked with a system of units. All numerical 
quantities must ultimately be compared with some standard. Many units in any system are 
superfluous, and can be reduced to combinations of some set of base units. The SI system 
lists seven units, namely: the kilogram, the second, the metre, the ampere, the kelvin, the 
mole, and the candela. These in turn relate to underlying dimensions, namely mass, length 
and time, electric charge and temperature. Amongst these dimensions, some consider only 
mass, length and time as fundamental. The use of the metre, the kilogram and the second
as base units of length, mass and time respectively is not unique even within modern
physics. Uzan (2003) noted that the SI system is only useful for measurements that are
"of human size". One can construct unit systems that are more appropriate for other
regimes. A common basis for high-energy physics is to construct a unit mass $m_e$, a unit
length $4\pi\epsilon_0\hbar^2/(m_e e^2)$ and a unit time $2\epsilon_0^2 h^3/(\pi m_e e^4)$ (Uzan, 2003).

Although the SI system is convenient, the various units in it are not fundamental. A more 
rational system for fundamental science associates units with certain physical constants, 
which seem to be intrinsic properties of our universe. Flowers & Petley (2001) considered
the following as fundamental: the electron charge $e$, the proton mass $m_p$, the reduced
Planck constant $\hbar$, the velocity of light in a vacuum $c$, the Avogadro constant $N_A$, the
Boltzmann constant $k_B$, the Newtonian gravitational constant $G$, and the permittivity and
permeability of free space $\epsilon_0$ and $\mu_0$. Clearly some of these quantities are not independent,
as for example $\epsilon_0\mu_0 = 1/c^2$, and in theory Avogadro's constant can be derived from
a sufficient ability to weigh and count atoms. Okun (1991) considered that only three
fundamental quantities are necessary: the metre, the second and the kilogram.

This work is concerned primarily with two constants of fundamental importance. The first 
is the proton-to-electron mass ratio, $\mu = m_p/m_e$, which is simply the ratio of the proton
mass to the electron mass. The second is the fine-structure constant, $\alpha = e^2/(4\pi\epsilon_0\hbar c)$.
The importance of these two constants is that they, along with an energy scale, completely 
define the gross structure of atomic and molecular systems (Born, 1935). Besides the 
obvious physical implications, this also means that much of chemistry ultimately hinges 
on these two numbers. Thus, these two numbers ultimately have an enormous impact on 
the universe. 
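As a rough numerical illustration, both constants can be evaluated directly from their definitions. The sketch below assumes the CODATA 2018 recommended values in SI units, which are not quoted in this work; the variable names are illustrative only.

```python
from math import pi

# CODATA 2018 recommended values in SI units (an assumption; not quoted in this work)
E_CHARGE = 1.602176634e-19      # elementary charge e [C]
HBAR = 1.054571817e-34          # reduced Planck constant [J s]
C_LIGHT = 299792458.0           # speed of light in vacuum [m/s]
EPS0 = 8.8541878128e-12         # permittivity of free space [F/m]
M_PROTON = 1.67262192369e-27    # proton mass [kg]
M_ELECTRON = 9.1093837015e-31   # electron mass [kg]

# Fine-structure constant: alpha = e^2 / (4 pi eps0 hbar c)
alpha = E_CHARGE**2 / (4 * pi * EPS0 * HBAR * C_LIGHT)

# Proton-to-electron mass ratio: mu = m_p / m_e
mu = M_PROTON / M_ELECTRON

print(f"alpha = {alpha:.6e}, 1/alpha = {1 / alpha:.3f}, mu = {mu:.3f}")
```

Both results are pure numbers, with $1/\alpha$ close to 137 and $\mu$ close to 1836.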

The constants that appear in our theories determine the proportionality between different 
quantities. As our knowledge increases, some constants are deprecated by our new-found 
ability to relate different quantities. The clearest historical example of this is gravitation, 
where prior to Newton it was widely believed that the acceleration due to gravity, g, was 
a universal quantity. Newton's inverse square law of gravitation yielded the force between
two masses as $F = G m_1 m_2/r^2$, which replaced one constant with another, but yielded
a relationship of much broader generality. G still retains the status of a fundamental 
constant today, the value of which we are unable to predict from other quantities. In fact, 
a reasonable definition of a fundamental physical constant at present is any proportionality 
constant of a fundamental theory which cannot be predicted. 

The standard model of physics, together with gravity, requires 22 unknown constants: the 
Newtonian constant, six Yukawa couplings for the quarks and three for the leptons, the 
mass and vacuum expectation value of the Higgs field, four parameters for the
Cabibbo-Kobayashi-Maskawa (CKM) matrix, three coupling constants, a UV cutoff, the
speed of light and Planck's constant (Hogan, 2000; Uzan, 2009). We require three constants
to define a system of units, leaving 19 unexplained dimensionless parameters. Indeed, the 
problem becomes worse with the discovery that neutrinos must be massive (see for instance 
Amsler et al., 2008). This implies at least seven more parameters in the standard model 
(three Yukawa couplings and four CKM parameters) (Uzan, 2009). The existence of a large 
number of free parameters in our fundamental theories is almost a prima facie suggestion 
that these theories are incomplete; one would hope that an all-encompassing fundamental 
theory would have far fewer free parameters (and perhaps none). On the other hand, it 
remains to be seen whether any of the fundamental constants can truly be predicted from
theory; some or all of them may turn out to be independent properties of the universe,
set seemingly at random and with no relationship to the laws which describe the dynamics
of particles.

There is a strong (although oft-forgotten) assumption within physics that constants are 
just that: constant. In fact, we already know that our constants are not the same under 
all regimes: the coupling constants of the three forces of the standard model "run" with 
energy, but the high-energy values can nevertheless be expressed in terms of their low- 
energy values. This fact aside, the accepted position is that our constants are invariant 
throughout time and space. 

How does one look for a change in constants? A naive approach is to search for variations 
in a convenient constant, such as the speed of light, in different times and places. Although 
such a variation might be found, the interpretation is severely hampered. A variation in c 
could mean any or all of the following: i) the physics underlying the propagation of light
is changing; ii) the length of the metre is changing; iii) the length of the second is
changing. These possibilities cannot be disentangled. Dicke (1962) notes the solution to this
problem: to only search for variation in dimensionless quantities. Detection of variation 
in a dimensionless quantity guarantees that it is the quantity under consideration which 
is changing, and not any aspect of the unit system. Changes in dimensionful quantities 
can be measured, but this necessarily entails an explicit statement about which units are 
assumed to be held fixed. Measured change in a dimensionless quantity would therefore 
unambiguously imply that physics is changing. 
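The distinction between dimensionful and dimensionless quantities can be checked numerically. The following is a minimal sketch, assuming CODATA 2018 values and a hypothetical helper `in_new_units` (neither appears in this work): it re-expresses four constants in a unit system built on the tonne, the centimetre and the minute, and shows that the numerical value of $c$ changes while that of $\alpha$ does not.

```python
from math import pi, isclose

# SI value and dimensional exponents (kg, m, s, A) of each constant.
# Values are CODATA 2018 (an assumption; not quoted in this work).
SI_CONSTANTS = {
    "e":    (1.602176634e-19,  (0, 0, 1, 1)),     # A s
    "eps0": (8.8541878128e-12, (-1, -3, 4, 2)),   # A^2 s^4 kg^-1 m^-3
    "hbar": (1.054571817e-34,  (1, 2, -1, 0)),    # kg m^2 s^-1
    "c":    (299792458.0,      (0, 1, -1, 0)),    # m s^-1
}

def in_new_units(constants, unit_sizes):
    """Numerical values of the constants when each base unit of mass, length,
    time and current is `unit_sizes[i]` times the size of its SI counterpart."""
    values = {}
    for name, (si_value, powers) in constants.items():
        factor = 1.0
        for size, power in zip(unit_sizes, powers):
            factor *= size ** (-power)
        values[name] = si_value * factor
    return values

def fine_structure(values):
    """alpha = e^2 / (4 pi eps0 hbar c), evaluated in whatever units are supplied."""
    return values["e"] ** 2 / (4 * pi * values["eps0"] * values["hbar"] * values["c"])

si = in_new_units(SI_CONSTANTS, (1.0, 1.0, 1.0, 1.0))
other = in_new_units(SI_CONSTANTS, (1000.0, 0.01, 60.0, 1.0))  # tonne, cm, minute, ampere

print("c:    ", si["c"], "vs", other["c"])
print("alpha:", fine_structure(si), "vs", fine_structure(other))
print("alpha unchanged:", isclose(fine_structure(si), fine_structure(other), rel_tol=1e-12))
```

A reported change in the number attached to $c$ could therefore reflect nothing more than a change in the metre or the second, whereas a change in $\alpha$ cannot be absorbed into the unit system.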

1.2 History of varying constants
1.2.1 Early considerations

The first investigations into whether the fundamental constants vary were due to Milne
(Milne, 1935, 1937) and Dirac (Dirac, 1937), who suggested that $G$ might vary with
cosmological time. Dirac's work in particular was a response to several observed apparent
coincidences between large numbers (this subsequently became known as the Large Number
Hypothesis, or LNH). In particular, Dirac noted that in atomic units of time the
age of the universe is $\sim 10^{39}$, whilst the number of protons in the observable universe is
$\sim 10^{78}$. The former is of the same order of magnitude as the ratio of the strength of the
electrical force between a proton and an electron to the strength of the gravitational force
between them. This led Dirac to speculate that perhaps these quantities were fundamentally
interrelated, and that, perhaps, $G \propto 1/t$ and $M \propto t^2$ (where $M$ is the amount of mass in
the universe). Teller (1948) objected that the implications of this cosmology were
inconsistent with paleontological data. However, Gamow (1967) showed that interpreting the
LNH as allowing for time variation of $e$ rather than $G$ side-stepped Teller's objections.
Although the suggestions of Milne and Dirac were based on numerological rather than
physical grounds, the discussion around them serves the purpose of showing that people
have seriously considered the variation of fundamental constants for quite some time.
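The large number at the heart of Dirac's argument, the ratio of the electrical to the gravitational force between a proton and an electron, is straightforward to evaluate. The sketch below assumes CODATA 2018 values (not quoted in this work); the function name is illustrative.

```python
from math import pi

# CODATA 2018 recommended values in SI units (an assumption; not quoted in this work)
E_CHARGE = 1.602176634e-19      # elementary charge [C]
EPS0 = 8.8541878128e-12         # permittivity of free space [F/m]
G_NEWTON = 6.67430e-11          # Newtonian gravitational constant [m^3 kg^-1 s^-2]
M_PROTON = 1.67262192369e-27    # proton mass [kg]
M_ELECTRON = 9.1093837015e-31   # electron mass [kg]

def electric_to_gravitational_ratio():
    """Ratio of the Coulomb force to the gravitational force between a proton
    and an electron. Both forces scale as 1/r^2, so the separation cancels."""
    electric = E_CHARGE**2 / (4 * pi * EPS0)
    gravitational = G_NEWTON * M_PROTON * M_ELECTRON
    return electric / gravitational

ratio = electric_to_gravitational_ratio()
print(f"F_electric / F_gravity = {ratio:.3e}")
```

The result is of order $10^{39}$ and is itself dimensionless, which is why it can meaningfully be compared with other pure numbers.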

Brans & Dicke (1961) placed the variation of G on a more rigorous footing by developing 
a self-consistent scalar-tensor extension of Einstein's General Relativity (GR), where the 
tensor component describes the classical GR behaviour, whilst the scalar part describes 
the propagation of a scalar field, which itself is a source of space-time curvature. So- 
called Brans-Dicke theories (and their extensions) are still considered as objects of interest, 
although constraints on them have become increasingly stringent. The theory predicts that
the post-Newtonian $\gamma$ parameter will deviate from the standard GR value of 1, instead
giving $\gamma = (\omega + 1)/(\omega + 2)$ (Weinberg, 1972), where $\omega$ is the dimensionless coupling constant
of the scalar field. Precision measurements using the Cassini spacecraft require $|\omega| > 40{,}000$
(Bertotti et al., 2003), which is "uncomfortably large" (Moffat & Toth, 2010).
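To see how stringent this bound is, the expression for the post-Newtonian parameter can be rearranged to give its deviation from the GR value. The short rearrangement below uses only the formula quoted above, taking $\omega = 40{,}000$ as the limiting value for illustration:

```latex
\gamma - 1 = \frac{\omega + 1}{\omega + 2} - 1 = -\frac{1}{\omega + 2},
\qquad
|\gamma - 1| < \frac{1}{40\,002} \approx 2.5 \times 10^{-5}
\quad \text{for } \omega > 40\,000 .
```

In other words, a bound on $\omega$ of this size corresponds to a permitted deviation from GR of only a few parts in $10^5$.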

1.2.2 Motivations for varying constants

Varying constants in the 21st century? A crazy thought, some might think. Modern phys- 
ical theories have enormous predictive power, and some might be content with the status 
quo (although certainly not Sir Karl Popper). Yet, it is well known that General Relativ- 
ity (which describes gravity) and the Standard Model (which describes electromagnetism, 
and the strong and weak nuclear forces) are incompatible. We do not have a good theory 
of quantum gravity at present. The incompleteness of our theories is reason enough to try 
to subject them to every test we can imagine. 

It is known experimentally that the constants "run" with energy (that is, they take on
different values at high energy scales), and so variation
of the physical constants under different local energy regimes has already been shown.
There is no known law or symmetry principle, other than an assumption for the sake of
simplicity, which prevents the constants of nature from varying in space and time. Thus,
it is necessary to check this assumption experimentally.

Notwithstanding the desire to try to falsify some aspect of our current understanding of 
physics, there are lines of argument which suggest that a variation of the fundamental con- 
stants might be possible, or even desirable. It has been explicitly shown that cosmological 
variation in the constants may proceed differently in different places and times (Forgacs & 
Horvath, 1979; Barrow, 1987; Damour & Polyakov, 1994; Li & Gott, 1998). Additionally, 
in any model of the universe with extra dimensions the constants of nature must
vary, although the magnitude of the variation is not constrained by theory (Kaluza, 1921;
Klein, 1926; Forgacs & Horvath, 1979; Barrow, 1987; Li & Gott, 1998).

Two considerations are worth presenting here: the well-known argument about the
triple-$\alpha$ process, which is of significant historical interest,
and a line of argument which has emerged from an apparent cosmological deficit of
$^7$Li relative to theoretical predictions (the "lithium problem").

1.2.2.1 The anthropic principle, the triple-$\alpha$ process and fine-tuning

An extremely well-known prediction made by Fred Hoyle concerns the existence of the
triple-$\alpha$ resonance, the reaction through which $^{12}$C is produced in our sun. The
production of $^{12}$C depends crucially on the carbon energy level at 7.65 MeV, which is only 0.3
MeV higher than the sum of the masses of three $\alpha$ particles (Okun, 1996); the relatively
small difference enhances the cross section of the reaction $3\alpha \rightarrow {}^{12}\mathrm{C}$. $^8$Be is
unstable, and therefore $^{12}$C cannot be produced in sufficient quantities through the reaction
$^8\mathrm{Be} + \alpha \rightarrow {}^{12}\mathrm{C}$. Hoyle predicted the existence of this resonance before it was
discovered experimentally, noting that without such a resonance we would not see the
observed quantities of $^{12}$C in the universe (Dunbar et al., 1953; Hoyle, 1954). That is,
without this resonance, humans would not exist! The requisite excited state of $^{12}$C has
become known as the Hoyle level (Ekstrom et al., 2009).

The reaction rate is extremely sensitive to the energy of the resonance. If $Q_{\alpha\alpha\alpha}$ is the
energy of the resonance, then the sensitivity of the reaction rate to a variation of $Q_{\alpha\alpha\alpha}$ is



(Ekstrom et al., 2009). If the energy of the Hoyle level were to increase, the amount of
observed $^{12}$C would be reduced on account of rapid processing to $^{16}$O (thereby increasing
the quantity of $^{16}$O), with the converse effect for a reduced energy of the level. It has been
estimated that carbon or oxygen production would be suppressed by a factor of between 30
and 1000 if the fine-structure constant differed by more than about four percent (or if the
strong force was different in strength by more than about half a percent) (Oberhummer
et al., 2000, 2003; Csoto et al., 2001; Schlattl et al., 2004). Indeed, our universe is effectively
on-resonance. Thus, some argue that our universe appears to be uniquely fine-tuned for
the existence of life, or at least life as we understand it (see for instance Davies, 2003).

Hoyle's argument from the existence of humans for the existence of the resonance is perhaps 
the only known good prediction made using the anthropic principle. Although one might 
take this to argue in favour of a deity or designer of some form, another argument via the 
weak anthropic principle is that there exists a statistical ensemble of universes, in which 
different values of fundamental constants are realised. The weak anthropic principle then 
yields that we simply find ourselves in one of the most life-friendly universes (Okun, 1996).

What are the actual requirements for life to form? Given our lack of understanding of 
the origin of life, these clearly remain unknown. Nevertheless, it is clear that the laws 
of physics must allow for the creation of complex structures. Rees (1999) considered
six dimensionless constants to be important in creating a universe which is amenable to
complicated structure (and thus life): the ratio of the strength of electromagnetism to that
of gravity, the strength of the nucleon binding force, the relative importance of gravity
and expansion energy in the universe, the cosmological constant ($\Lambda$), the ratio of the
gravitational energy required to unbind a galaxy to its mass energy equivalent and the 
number of spatial dimensions. Other viewpoints are possible. 

However, whether our universe is fine-tuned or not remains a point of contention. For
instance, Stenger (2000) considered two numbers of interest, $N_1$ and $N_2$. $N_1$ is the ratio of
the strength of the electromagnetic force to the gravitational force between a proton and an
electron, and is given by $N_1 \sim 10^{39}$. $N_2$ is the ratio of a typical stellar lifetime to the time for
light to traverse the radius of a proton. Dirac (1937) noted that $N_1 \sim N_2$. Dicke (1961)
noted that $N_2$ must be large in a universe with life, so that stars live long enough to
generate heavy elements. He also noted that $N_1$ must be of similar magnitude in order for
the universe to have elements heavier than lithium. Stenger simulated different universes
in which fundamental parameters differ. In particular, he varied $\alpha$ (the fine-structure
constant), $\alpha_s$ (the strong nuclear interaction strength at low energy), $m_e$ (the electron
mass) and $m_p$ (the proton mass). If one defines the dimensionless gravitational strength
as $\alpha_G = G m_p^2 (\hbar c)^{-1}$, then $N_1 = (\alpha/\alpha_G)\mu$ and $N_2 = \alpha\alpha_s\mu N_1$ (see Stenger, 2000, and
references therein). In 100 toy universes, Stenger generated each of the four parameters
above from a range of four orders of magnitude below their values in our universe to four
orders of magnitude above. For this range of parameters, $N_1$ > 10^^ and $N_2$ > 10^*^ in
most cases. He noted that although $N_1 \sim N_2$ does not occur in most cases, nevertheless
an approximate coincidence between these two quantities is not rare either. He concluded
that a rather wide variation in the fundamental constants still produces universes in which
complex matter can form, and thus perhaps life.

It must be said that Stenger's arguments are themselves not without criticism. Barnes
(2010) gave a wide range of criticisms of Stenger's analysis. He noted that one only
needs to find a single instance of fine-tuning for the universe to be fine-tuned. Conversely,
showing that simple toy universes are amenable to life does not prove that the universe is
not fine-tuned. Barnes suggested that Stenger ignores the following requirements in his model
universes: i) the stability of atoms; ii) the need to be able to form complex structures
(showing that stars can exist does not show that the complex chemical structures necessary
for life are stable); iii) the need for suitable stars; iv) the need for large planets (if gravity
is too strong, planets which support life will be too small to support ecosystems); and
v) other constraints on the masses of fundamental particles. Barnes also noted that the
choice of priors on the parameters necessarily leads to the conclusion of long-lived stars
in half of the universes considered. Another objection is simply that the analysis is too
simplistic for the strength of the claim made; the detailed work of Oberhummer et al.
(2000, and other references above) seems to clearly show the constraints placed on the
strength of electromagnetism and the strong force. Hogan (2000) considered constraints
on fundamental particles necessary for complex structure, and noted that the difference
between the mass of the up and down quarks appears quite finely tuned.

Ultimately, whether our universe is fine-tuned for life is difficult to resolve. However, if one 
accepts that our universe is fine-tuned for life, then arguments such as that of Okun (1996) 
provide an escape from invoking the existence of a creator; variations of the fundamental 
constants can help defuse what seems otherwise to be a rather special situation that we 
find ourselves in. Of course, perhaps if we understood physics better (and thus understood 
the true origin of the physical constants, if such an understanding is possible) then this 
problem might be solved in any event. We return to consideration of the triple-$\alpha$ process
in light of the results of chapter 4 in section 6.1.4.

1.2.2.2 The "lithium problem"

A review of this problem has been recently presented by Berengut et al. (2010a), so we 
present the issues in summary here. 

Big Bang nucleosynthesis (BBN) theory attempts to predict the observed abundances of
elements from fundamental physics. Precise measurements of
the neutron half-life and the WMAP measurements of the baryon-to-photon ratio, $\eta$, have
made BBN theory essentially parameter free (Berengut et al., 2010a; Amsler et al., 2008;
Cyburt et al., 2008). There is excellent agreement between the predicted and observed
abundances of deuterium and $^4$He; however, BBN overpredicts the abundance of $^7$Li
(Amsler et al., 2008). This is known as the "lithium problem". The amount of $^7$Li produced
is overpredicted by a factor of between 2.4 and 4.3 compared with observation (Cyburt et al.,
2008). This difference is significant at the 4 to $5\sigma$ level. The abundance of $^7$Li is determined from
metal-poor population II stars in our galaxy (Asplund et al., 2006; Bonifacio et al., 2007;
Hosford et al., 2009). It is noted observationally that the lithium abundance does not vary
appreciably over many orders of magnitude of metallicity in the stars considered (an effect
known as the Spite plateau) (Spite & Spite, 1982).

The rates of reactions which produce $^7$Li are sensitive to the value of certain fundamental
constants or quantities derived from them. In particular, the predictions of BBN are sensitive to
the deuterium binding energy, $B_d$, as this determines the temperature at which deuterium
is subject to photo-disintegration and therefore the time at which nucleosynthesis begins
(Dmitriev et al., 2004). Dmitriev et al. (2004) varied $B_d$ to minimise the $^7$Li discrepancy,
and found that $\Delta B_d/B_d = -0.019 \pm 0.005$, which is possible evidence for variation of $B_d$.

Flambaum & Wiringa (2007) and Berengut et al. (2008) considered the effect of variation
of $X_q = m_q/\Lambda_{\rm QCD}$, where $m_q$ is the light quark mass and $\Lambda_{\rm QCD}$ is the pole in the running
strong-coupling constant. They parameterise $\Delta X_q/X_q = \Delta m_q/m_q$ (this does not assume
that $\Lambda_{\rm QCD}$ is constant but instead assumes that all dimensions are in units of $\Lambda_{\rm QCD}$). They
found that allowing for variation of $m_q$ can resolve the discrepancy between predicted and
observed $^7$Li abundances, but only if one ignores the shift in the resonances
for certain reactions. In particular, they examined the effect of a variation in $X_q$ on
the reactions $^3\mathrm{He}(d,p)^4\mathrm{He}$ and $t(d,n)^4\mathrm{He}$, where the reaction cross-section is
dominated by a narrow resonance. Including these resonances leads to the conclusion
that variations in $X_q$ may not be able to explain the lithium problem. However, they
noted that they have not considered the effect of the ^He* and ^Li* resonances, which may
be very sensitive to $\Delta X_q/X_q$, and leave this consideration for future work.

Thus, although it seems that variation of fundamental constants might be able to resolve
the discrepancy between BBN theory and the observed $^7$Li abundance, more work is clearly
needed.



1.3 How to find variation in a constant

Most methods of searching for variations in a dimensionless constant share the same
fundamental derivation. For a dimensionless constant, $P$, and observable quantity, $O$, one
attempts to derive a change in the observed quantity as a function of a change in a relevant
dimensionless ratio,

$$ \Delta O = k\,\frac{\Delta P}{P} + \mathcal{O}\!\left(\frac{\Delta^2 P}{P^2}\right), \tag{1.1} $$

where $k$ determines the sensitivity to the effect; for a particular circumstance $k$ is referred
to as the "sensitivity coefficient". The second order term can be neglected in almost all
circumstances as the variations in the fundamental constants, if they occur, are small in
all regimes which can currently be probed, although clearly if exact expressions are
available they should be used (i.e. $\Delta O = k \times f[\Delta P/P]$). In many circumstances, multiple
dimensionless constants are relevant to the problem, in which case this becomes a sum
over the constants of interest,

$$ \Delta O = \sum_i k_i\,\frac{\Delta P_i}{P_i}. \tag{1.2} $$

One then compares the observations of $O$ at different time periods to probe temporal
evolution in the various $P_i$, or at different places to probe spatial variation in $P_i$. Some
care is required in disentangling the effects, as observations to large distances necessarily
entail observations to the deep past due to the finite speed of light.
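The first-order relation can be sketched in a few lines of code. The example below is illustrative only: the function names are hypothetical, and the sensitivity coefficients and fractional changes are invented numbers chosen to show the arithmetic, not values from any measurement.

```python
def first_order_shift(sensitivities, fractional_changes):
    """First-order change in an observable,
    Delta O = sum_i k_i * (Delta P_i / P_i), neglecting second-order terms."""
    return sum(k * fractional_changes[name] for name, k in sensitivities.items())

def infer_fractional_change(delta_o, k):
    """Invert the single-constant relation Delta O = k * (Delta P / P)."""
    if k == 0:
        raise ValueError("observable is insensitive to this constant (k = 0)")
    return delta_o / k

# Hypothetical sensitivity coefficients and fractional changes (illustration only)
k = {"alpha": 2.0, "mu": -0.5}
dP_over_P = {"alpha": 1.0e-5, "mu": 4.0e-6}

shift = first_order_shift(k, dP_over_P)
print(f"Delta O = {shift:.2e}")

# With a single relevant constant, a measured shift maps straight back to Delta P / P
print(f"Delta P / P = {infer_fractional_change(2.0e-5, k['alpha']):.2e}")
```

When more than one constant contributes, a single observable constrains only the weighted combination, so observables with different sensitivity coefficients are needed to separate the individual $\Delta P_i/P_i$.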

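As we search for a variation in $\mu$ and $\alpha$, we define the quantity

$$ \frac{\Delta\mu}{\mu} \equiv \frac{\mu_z - \mu_0}{\mu_0}, \tag{1.3} $$

where $\mu_z$ is the measurement of $\mu$ at some redshift $z$, and $\mu_0$ is the laboratory value.
Similarly, we define

$$ \frac{\Delta\alpha}{\alpha} \equiv \frac{\alpha_z - \alpha_0}{\alpha_0}. \tag{1.4} $$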
No variation of a fundamental constant has been conclusively accepted at present, and
thus the goal of experimentation is to obtain higher accuracy and precision. For temporal 
evolution, there are two primary paths available. One is to use long temporal base-lines, 
with the hope that small changes will become magnified. This path leads directly to 
astrophysical observations, which can probe effectively the entire age of the universe by 
looking to sufficiently high redshift. Certain aspects of the solar system also carry the 
integrated history of the physics they have been subjected to, which allows the probing of 
about 5 billion years into the past. Although legitimate, astrophysical methods suffer
from the fact that one can only observe the past, not experiment on it, and therefore con- 
trolling systematic errors may be difficult. The other path is to perform experiments over 
human time scales, but attempt to obtain extreme precision (usually through application 
of modern technology and human ingenuity in laboratories). 

Probing spatial variation directly is rather more difficult. This relates to the fact that 
humans are confined to the solar system, and the velocity of the solar system is small 
in any reasonable reference frame (particularly the cosmic microwave background (CMB) 
rest frame). Present-day tests within the solar system (which make use of the Earth's orbit 
around the sun) do not probe large amounts of space relative to the observable universe, 
and therefore detection of spatial variation is difficult. This problem does not hold true for 
astrophysical observations to high redshift, which can not only probe most of the temporal 
history of the universe but also most of the spatial volume of the observable universe. 



1.4 Theories for variation of fundamental constants



The current investigations into whether the fundamental constants of nature vary are 
limited by experiment. Because the variation of any fundamental constant has not been 
conclusively demonstrated, a cornucopia of theories and models which generate variations 
in the fundamental constants have been created; there is no space to detail most of these 
here. However, as experimental constraints on both present-day and past variation of the 
constants have improved, the parameter space into which these theories can fit is being 
steadily compressed. Unfortunately, many theories suffer from the need to introduce 
parameters which translate into the magnitude of the variation of different constants^1.
There is often no natural magnitude for these parameters, and therefore these theories
may not be easily falsifiable (or may not be falsifiable at all); experimental constraints
simply keep diminishing the magnitude of the parameters.

^1 This seems to replace one constant with another; however, discovery of such a mechanism might yield
further insights toward fundamental theories.

Nevertheless, there are two important conclusions to draw from the theoretical approaches
to generating a variation in the fundamental constants. Firstly, it is possible to construct
theories as extensions to existing physics which allow for the variation of some or all
of the fundamental constants. This is important, because it lends plausibility to the
idea that the constants might vary, therefore lending weight to experimental searches. 
Secondly, some theories make falsifiable predictions. This is particularly important for the 
case of string theories (and their cohorts), which have held out hope as a post-standard 
model framework, even if they have not yet yielded the revolution that has been hoped 
for. In particular, many string-type models make predictions as to how the fundamental 
constants should vary in relation to each other. Thus, constraints on several different 
constants might be used to constrain or falsify string theories. Given the intellectual 
effort expended on and general lack of accessible laboratory tests for string theories, the 
potential for investigations into the fundamental constants to constrain string theories 
should be taken seriously. 

This work is an experimental one, and the proliferation of theoretical frameworks for 
varying constants continues to grow. Murphy (2002) provides a brief overview. Uzan 
(2003) and Uzan (2010) provide a more wide-ranging treatment of different approaches 
which might be taken. We therefore present here only a brief history of the theoretical 
treatment of the variation of fundamental constants. 

1.4.1 Modern viewpoints

Although people have attempted to construct theories to contrive a variation in the funda- 
mental constants, it seems that general efforts towards unification of the four fundamental 
forces of nature often naturally produce variation of the constants. Murphy (2002) notes 
that historical attempts are loosely divided into multidimensional unification theories (among
which the now well-known string theories fall) and scalar field theories.

Kaluza-Klein theory (Kaluza, 1921; Klein, 1926) derives from the fact that the solution 
of a 5-dimensional extension to GR in fact looks like the standard 4-dimensional GR plus 
Maxwell's equations. This observation has motivated a large interest in attempts to unify 
the fundamental constants of nature through the construction of additional dimensions. 
The extra spatial dimension is proposed to be "compactified" on a microscopic scale, 
therefore explaining why it is not directly observed. More generally, for N-dimensional 
extensions, the 3D gauge couplings vary as the inverse square of the mean scale of the extra 
dimensions. Evolution in the scale of the extra dimensions therefore leads to variability 
of the observed coupling constants in Kaluza-Klein theories, and in string theories more 
generally. 

Bekenstein (1982) proposed a self-consistent scalar field theory incorporating a varying α. 
In the limit of constant α, the theory reduces to Maxwell's equations. The theory describes 
the evolution of a scalar field, where space-time evolution of the scalar field produces a 
change in α. Although the impact on gravity was originally neglected, it has subsequently 
been included as a modification of the theory (Barrow & Magueijo, 2000; Magueijo, 2000). 
Dent (2008) considers constraints placed on coupling between a scalar field and particular 
constants in light of recent data. 

1-4.2 Relationships between variation in different constants 

As noted earlier, within the framework of Grand Unified Theories (GUTs), and also string 
theories, one can derive approximate relationships between changes in different constants. 
For instance, one obtains 

$$\frac{\Delta\mu}{\mu} = R\,\frac{\Delta\alpha}{\alpha}. \qquad (1.5)$$

The sign of R may differ depending on the derivation, but researchers typically report 
|R| of between 30 and 40 (Calmet & Fritzsch, 2002; Calmet, 2002; Langacker et al., 2002; 
Dent, 2008) for both GUTs and string theories. Dent (2008) noted that quite a wide range 
of proportionality constants can be obtained, however. In the circumstance where no 
variation in μ or α has been seen, this relationship is of no practical use. However, 
in the event that variation is seen in either μ or α, one can then use this relationship 
to potentially falsify an apparently quite wide class of theories. Nevertheless, it also 
seems possible to generate smaller values of R, although this requires fine-tuning of the 
unification model that many consider to be unnatural (Dine et al., 2003). We return to 
this proportionality later. 
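
As a worked illustration of equation (1.5), the short sketch below converts a fractional change in α into the implied fractional change in μ for the range of |R| quoted above. The input value of Δα/α is an arbitrary placeholder chosen for illustration, not a measurement, and the function name is our own.

```python
def implied_dmu_over_mu(dalpha_over_alpha, r):
    """Return dmu/mu implied by dmu/mu = R * dalpha/alpha (equation 1.5)."""
    return r * dalpha_over_alpha


# Hypothetical input, chosen only for illustration (not a measured value).
example_dalpha = 1.0e-6

# |R| of 30 to 40, with either sign, as quoted in the text.
for r in (30, 40, -30, -40):
    dmu = implied_dmu_over_mu(example_dalpha, r)
    print(f"R = {r:+d}: dmu/mu = {dmu:+.1e}")
```

Because |R| is large, a constraint on Δμ/μ at a given precision is, under this class of models, a constraint on Δα/α that is tighter by a factor of a few tens.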

1-4.3 Mach's principle 

An interesting argument has recently emerged based on Mach's principle. Mach's principle 
asserts that the local laws of physics are somehow due to non-local interaction with all 
the other matter in the universe. This has been postulated to explain why a unique frame 
exists with zero angular momentum ("if something is rotating, what can it be rotating 
with respect to other than the rest of the universe?"). Gogberashvili & Kanatchikov (2010) 
considered a simple Machian model in which they estimate the gravitational energy of 
baryons and the electromagnetic energy of radiation. They identified the total Machian 
energy of all particles with that of dark energy, and concluded that the fine-structure constant can 
be defined in terms of cosmological parameters by 

a = 47rnl^, (1.6) 

where Ω_Λ is the dark energy density, Ω_b is the baryonic density and Ω_r is the radiation 
density, all expressed as ratios of the critical density. Using cosmological data and 
associated errors, they concluded that α ≈ (7.5 ± 0.4) × 10⁻³, which is α⁻¹ ≈ 133 ± 7, 
surprisingly close to the present value of α⁻¹ ≈ 137. At the time of writing, this work 
was unrefereed and so we are unsure of its import. Nevertheless, Mach's principle has 
remained appealing to many, even if it is rather loosely defined, and the serious return of 
Machian arguments would be an interesting turn for physics. 

1-5 Structure of this work 

The goal of this work has been to investigate the variation of two important fundamental 
constants using quasar absorption lines. This work is divided into four primary sections: 

1. In chapter 3, we investigate possible changes in the proton-to-electron mass ratio, 
μ = m_p/m_e, using UV molecular hydrogen transitions at high redshift. 

2. In chapter 4, we use redshifted metal line absorption in quasar spectra to investigate 
the possibility that the fine-structure constant, α = e²/(4πε₀ħc), has changed. Both 
of these chapters make use of data obtained with the Ultraviolet and Visual Echelle 
Spectrograph (UVES), mounted on the Very Large Telescope (VLT), in Chile. In 
chapter 5, we consider potential systematic errors for our analysis of Δα/α. 

3. In chapter 6, we consider the μ and α results in the context of each other, and in 
the context of constraints from other methods. 

4. A critical concern when modelling quasar absorption lines is whether the optimisa- 
tion algorithm used to fit the models to the spectral data has converged, and whether 
it returns sensible errors. Although the reliability of vpfit (the program we use) 
has been confirmed through simulations, recent publications draw attention to the 
need to ensure that error estimates are accurate. Moreover, it would be useful to 
confirm in specific cases that vpfit produces appropriate parameter estimates and 
uncertainties, rather than relying on ensemble results from synthetic spectra. Thus, 
in chapter 7 we apply Markov Chain Monte Carlo (MCMC) methods to confirm both 
that VPFIT does indeed converge and that the uncertainty estimates it provides are 
reasonable. 

Some of the methods and methodology are common to the analysis of both μ and α. We 
discuss these in chapter 2. 

Finally, in chapter 8 we present our conclusions. Below, we give non-quasar constraints 
on Δμ/μ and Δα/α. We give quasar constraints on Δμ/μ and Δα/α in chapters 3 and 4 
respectively. 

1-6 Non-quasar constraints on Δμ/μ and Δα/α 

The most precise present-day bounds on variation of μ and α derive from atomic clocks. 
This method relies on the fact that different transitions have different sensitivities to a 
variation in μ and α. By comparing two clocks which use transitions with significantly 
different sensitivities to a change in μ or α, one can derive strong bounds on the present-day 
rate of change of these constants. Improved precision is obtained by utilising transitions 
with sensitivity coefficients of greater difference, by running the experiment for longer, or 
by building more precise atomic clocks. 
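
The two-clock comparison described above can be sketched numerically. The snippet assumes the commonly used linear parameterisation d ln(f1/f2)/dt = (K1 − K2) d ln α/dt, where K1 and K2 are the sensitivity coefficients of the two transitions; this relation, the function name and all numerical values are illustrative assumptions rather than quantities taken from the experiments cited in this chapter.

```python
def alpha_drift_from_clock_ratio(ratio_drift, ratio_drift_err, k1, k2):
    """Convert the fractional drift rate of a frequency ratio f1/f2 into a
    drift rate for alpha, assuming
        d ln(f1/f2)/dt = (k1 - k2) * d ln(alpha)/dt.
    Returns (rate, uncertainty) in the same units as the inputs."""
    dk = k1 - k2
    if dk == 0:
        raise ValueError("clocks with equal sensitivities cannot constrain alpha")
    return ratio_drift / dk, abs(ratio_drift_err / dk)


# Hypothetical numbers for illustration only (per year).
rate, err = alpha_drift_from_clock_ratio(2.0e-16, 4.0e-16, k1=0.0, k2=-2.0)
print(f"alpha drift = ({rate:.1e} +/- {err:.1e}) per year")
```

The division by K1 − K2 shows why transitions with very different sensitivity coefficients are preferred: for a fixed measurement precision on the frequency ratio, a larger difference gives a proportionally tighter bound.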

Atomic clock measurements only constrain the present time rate-of-change of fundamental 
constants. To fully investigate the universe, we must turn to observations of the Solar 
System and elsewhere. The cosmic microwave background (CMB) allows us to derive 
constraints on Δμ/μ and Δα/α at z ~ 1100. Big Bang nucleosynthesis, noted earlier, 
allows us to probe the first minutes after the Big Bang. Ultimately all of these avenues 
are of interest, because they allow us to probe most of the history of the universe, albeit 
with differing sensitivities. Very loosely, we can probe the temporal evolution of certain 
constants extremely well at the present day (i.e. a fractional change at the ~ 10~^^-10~^^ 
per year level at z = 0), reasonably well through to redshifts of a few (i.e. at the ~ 10~^ 
level) and at the 10⁻² level at the CMB era (z ~ 1100). 

1-6.1 The proton-to-electron mass ratio, μ 
1-6.1.1 Atomic clocks 

A strong present-day direct bound on variation of μ is obtained through comparison of the 
molecular transitions in SF₆ to the Cs standard, yielding (dμ/dt)/μ = (−3.8 ± 5.6) × 10⁻¹⁴ yr⁻¹ 
(Shelkovnikov et al., 2008). The combination of a series of atomic clock experiments from 
Sr, Hg⁺, Yb⁺ and a H maser yields Δμ/μ = (−1.6 ± 1.7) × 10⁻¹⁵ per year (Blatt et al., 
2008)^. Blatt et al. (2008) similarly conclude that there is no coupling of α, μ and the 
light quark mass to the gravitational potential at the present level of accuracy. Salumbides 
(2009) noted that use of Sr₂ transitions and the inversion transitions of NH₃ may be able 
to probe variation in μ at the level of 10^^^ per year in the near future. 

Shaw & Barrow (2010) used the fact that the ratio of optical to Cs frequencies is sensitive 
to changes in μ, although there is a degeneracy with α. They combine the Yb⁺ 
measurements of Peik et al. (2004), and other data to conclude that the coupling 
between μ and gravity is k_μ = (3.9 ± 3.1) × 10⁻⁶. 

1-6.1.2 Galactic ammonia 

Molaro et al. (2009) investigated potential variation of μ within the Milky Way by searching 
for radial velocity offsets between the inversion transitions of NH₃ (which are sensitive to 
a change in μ) and control molecules CCS and N₂H⁺, and concluded that |Δμ/μ| < 10⁻⁷. 
However, they noted positive velocity shifts between the line centres of NH₃ and the two 
other molecules and noted that if this were due to a change in μ it would imply a spatial 
variation of Δμ/μ ~ 4 × 10⁻⁸. They also noted that this would conflict with atomic 
clock experiments by about five orders of magnitude, thereby requiring chameleon-type 
theories, in which the values of the constants have a dependence on the local matter 
density. However, we note that the observations are of emission lines. In emission, there 
are significant optical depth effects, where the transitions may arise from significantly 
different places both in the radial direction as well as in the spatial direction. The beam 
size for the Green Bank Telescope (GBT) for their observations corresponds to 0.04 pc at 
the distance of the Perseus cloud observed. Due to the significant potential systematics 
intrinsic to emission observations, we would be extremely cautious about interpreting the 
results of Molaro et al. as evidence for spatial variation in μ. 

^Blatt et al. (2008) use μ = m_e/m_p and therefore a sign reversal is required. 

1-6.2 The fine-structure constant, α 
1-6.2.1 Atomic clocks 

The combination of a series of atomic clock experiments from Sr, Hg⁺, Yb⁺ and a H maser 
yields (dα/dt)/α = (−3.3 ± 3.0) × 10⁻¹⁶ yr⁻¹ (Blatt et al., 2008); Blatt et al. also concluded 
that α, μ and the light quark mass do not couple to the local gravitational potential at the 
current experimental limit. The experiment of Rosenband et al. (2008) compared the ratio 
of single-ion Al⁺ and Hg⁺ optical clocks to conclude that (dα/dt)/α = (−5.3 ± 7.9) × 10⁻¹⁷ yr⁻¹, 
an extremely precise constraint. 

Dysprosium displays two nearly degenerate energy levels of differing sensitivity to Δα/α; 
the resonance enhances the sensitivity coefficients of the transitions. Cingoz et al. (2007) 
utilised these transitions to find that Δα/α = (−2.7 ± 2.6) × 10⁻¹⁵ per year. 

Shaw & Barrow (2010) searched for annual variation in the results of Rosenband et al. 
(2008) to examine the coupling constant between α and gravity, k_α, and concluded that 
k_α = (−5.4 ± 5.1) × 10⁻⁸. 

1-6.2.2 Direct solar system observations 

Iorio (2010) considers the effect of a varying speed of light^ on the precession of the 
perihelion of the orbits of various inner solar system planets, and concludes that (dc/dt)/c = 
(0.5 ± 2) × 10⁻⁷ yr⁻¹ over the past century based on astronomical observations. 

In synchrotron accelerators, when electrons scatter off a laser beam whilst in flight they 
emit a spectrum of radiation. The lower edge of the spectrum, the Compton Edge (CE), 
depends on the velocity of light. Gurzadyan et al. (2010) used measurements of the CE 
in the GRAAL beam-line at the European Synchrotron Radiation Facility (ESRF) in 
Grenoble to constrain anisotropy in the speed of light, Δc/c. Using data from 
2008, they constrain anisotropy in the velocity of light to Δc/c < 10⁻¹⁴. If one assumes 
constancy of e and ħ this implies^ that Δα/α < 10⁻¹⁴. 

^It is assumed that e and ħ are held constant. 



1-6.3 The weak equivalence principle 

The weak equivalence principle (WEP) states that the trajectory of a free-falling body under 
gravity is independent of its composition (Dent, 2008). This is equivalent to requiring that 
inertial and gravitational masses are identical. The Einstein equivalence principle (EEP) 
is a stronger statement than the WEP. The EEP requires: i) that the WEP holds; ii) that 
the outcome of any non-gravitational experiment conducted in free-fall is independent of 
the velocity of the experiment (local Lorentz invariance, or LLI); and iii) that the outcome of 
any non-gravitational experiment conducted in free-fall is independent of the location and 
time of the experiment (local position invariance, or LPI) (Dent, 2008). Variation of the 
fundamental constants would imply a change in the composition of the object; the mass 
of nucleons is in substantial part due to coupling constants to fundamental forces, and 
therefore variation of fundamental constants would change the mass of an object, thereby 
causing a violation of both the second and third points above, and thus the EEP 
(Salumbides et al., 2006). The strong equivalence principle (SEP) says that the outcome 
of any experiment (gravitational or not) in a free-falling reference frame is independent of 
the position in space-time. Violation of the strong equivalence principle would manifest 
as a fifth force (Dent, 2008). 

The WEP has been stringently tested by both Eötvös-type torsion balance experiments 
(Schlamminger et al., 2008) and the measurement of free-fall of the Moon via lunar laser 
ranging experiments (Williams et al., 2004). These measure the Eötvös parameter 

$$\eta = 2\,\frac{r_a - r_b}{r_a + r_b}, \qquad (1.7)$$

where r_a and r_b are the ratios of the gravitational mass to the inertial mass of particles 
a and b respectively. Both experiments given above yield constraints on violation of the 
equivalence principle at the 10⁻¹³ level. Schlamminger et al. (2008) calculated that 
space-fixed differential accelerations in any direction are limited to less than 8.8 × 10⁻¹⁵ m s⁻² at 
the 95% confidence level. Tobar et al. (2010) compared various hydrogen masers to a 
cryogenic sapphire oscillator for sidereal and annual modulations of the oscillator frequency, 
and constrain both (and thus LLI and LPI violation) at the few parts in 10*^ level. 
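
To make the size of the quoted limits concrete, the sketch below evaluates the Eötvös parameter in its standard form, η = 2(r_a − r_b)/(r_a + r_b), for two hypothetical test bodies. The mass ratios used are invented for illustration and do not correspond to any experiment cited here.

```python
def eotvos_parameter(r_a, r_b):
    """Eotvos parameter eta = 2 (r_a - r_b) / (r_a + r_b), where r_a and r_b
    are the gravitational-to-inertial mass ratios of bodies a and b."""
    return 2.0 * (r_a - r_b) / (r_a + r_b)


# Hypothetical ratios differing by one part in 10^13 (illustration only).
print(eotvos_parameter(1.0 + 1.0e-13, 1.0))

# Identical ratios give eta = 0, i.e. no WEP violation.
print(eotvos_parameter(1.0, 1.0))
```

Since both ratios are very close to unity, η is essentially the fractional difference between them, which is why a limit at the 10⁻¹³ level is a direct statement about composition-dependent free fall.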



^Δα/α = −Δc/c if e and ħ are assumed to be fixed. 






1-6.4 The Oklo natural nuclear reactor 



It was discovered in the 1970s that a uranium deposit at Oklo, in Gabon, showed 
depletion of ²³⁵U relative to the natural abundance, as well as anomalies in the abundance 
of isotopes of other elements. The observed abundances are explained by the operation 
of a water-moderated natural nuclear fission reactor about 1.8 billion years ago (Naudet, 
1974; Maurette, 1976). This effect was made possible by the relatively higher isotopic 
abundance of ²³⁵U then (about 3.7%) compared to today (about 0.72%). 

The production of ¹⁵⁰Sm by neutron capture depends strongly on a ~ 0.1 eV resonance. 
Thus, the ¹⁴⁹Sm/¹⁴⁷Sm abundance ratio today constrains variation in fundamental 
constants at the time of operation of the reactor. Shlyakhter (1976) estimated the shift of 
the resonance due to variation in α. Damour & Dyson (1996) claimed that the 
measured abundance ratio leads to the constraint −0.9 × 10⁻⁷ < Δα/α < 1.2 × 10⁻⁷. Fujii 
et al. (2000) used new samples from Oklo to find Δα/α = (−0.04 ± 0.15) × 10⁻⁷. Gould 
et al. (2006) claim −0.11 × 10⁻⁷ < Δα/α < 0.24 × 10⁻⁷. Petrov et al. (2006) give 
−0.56 × 10⁻⁷ < Δα/α < 0.66 × 10⁻⁷. 

However, Flambaum & Wiringa (2009) noted that the shift of the resonance induced is 

$$\Delta E \approx 10\left(\frac{\Delta X_q}{X_q} - 0.1\,\frac{\Delta\alpha}{\alpha}\right)\ \mathrm{MeV}, \qquad (1.8)$$

where X_q = m_q/Λ_QCD and m_q is the light quark mass. As such, the shift in the resonance 
is dominated by the first term, and so the Oklo reactor measurements cannot give any 
constraint on Δα/α without the wholly unjustified assumption that ΔX_q/X_q = 0 (Flambaum 
& Berengut, 2009). Flambaum & Berengut (2009) used the findings that |ΔE| < 0.1 eV 
(Fujii et al., 2000; Gould et al., 2006; Petrov et al., 2006) to give the constraint 

$$\left|\frac{\Delta X_q}{X_q} - 0.1\,\frac{\Delta\alpha}{\alpha}\right| < 4\times10^{-9}. \qquad (1.9)$$

If one assumes linear temporal variation this leads to 

$$\left|\frac{\dot{X}_q}{X_q} - 0.1\,\frac{\dot{\alpha}}{\alpha}\right| < 2.2\times10^{-18}\ \mathrm{yr}^{-1}. \qquad (1.10)$$
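
The arithmetic connecting these bounds can be checked directly. The sketch below takes the 10 MeV coefficient of equation (1.8), the 0.1 eV limit on the resonance shift, the bound of 4 × 10⁻⁹ from equation (1.9) and the reactor age of about 1.8 billion years quoted earlier; the variable names are our own, and the first quantity is only an order-of-magnitude estimate rather than the published bound.

```python
RESONANCE_SCALE_EV = 10.0e6  # the 10 MeV coefficient in equation (1.8), in eV
MAX_SHIFT_EV = 0.1           # |Delta E| < 0.1 eV
REACTOR_AGE_YR = 1.8e9       # time since the reactor operated, in years

# Order-of-magnitude bound on |dX_q/X_q - 0.1 dalpha/alpha| from equation (1.8).
crude_bound = MAX_SHIFT_EV / RESONANCE_SCALE_EV

# Average rate implied by the bound in equation (1.9), assuming linear variation.
quoted_bound = 4.0e-9
rate_per_year = quoted_bound / REACTOR_AGE_YR

print(f"crude bound from (1.8): {crude_bound:.0e}")
print(f"rate bound from (1.9):  {rate_per_year:.1e} per year")
```

The crude estimate lands within a factor of a few of equation (1.9), and dividing that bound by the reactor age reproduces the rate in equation (1.10).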



1-6.5 Cosmic microwave background (CMB) 

The CMB provides an investigation of the variability of the fundamental constants at 
very high redshift, z ~ 1100. This gives the longest practical baseline over which the 
fundamental constants can be examined via electromagnetic radiation at present, as the 
universe is opaque to light at higher redshifts. Neutrinos can in principle be used to probe 
higher redshifts, but this remains well beyond practical examination at present. The 
CMB is sensitive to the variation of α, as the strength of the electromagnetic force affects 
the Thomson scattering cross-section and the ionisation fraction (Salumbides et al., 2006; 
Uzan, 2003). Increasing α, for instance, increases the amount of power at small scales in 
the CMB power spectrum (Kaplinghat et al., 1999; Hannestad, 1999). The constraints are 
unfortunately only at the 10⁻² level, although this may improve with time. The situation 
is not assisted by strong degeneracies between different parameters. 

Landau & Scoccola (2010) used 7 year WMAP data and a model-free^ approach to find 
Δα/α = −0.014 ± 0.007 when only α variation was considered, Δα/α = −0.014 ± 0.009 
and Δm_e/m_e = −0.001 ± 0.035 when both α and m_e were allowed to vary, and Δm_e/m_e = 
−0.036 ± 0.025 when only m_e was allowed to vary (m_p was held constant for these purposes 
due to a strong degeneracy with the baryon mass density and number density). Nakashima 
et al. (2010) allowed variation of m_p and assumed that the variation in different coupling 
constants is driven by a single scalar field (the dilaton), and obtained −8.28 × 10⁻³ < 
Δα/α < 1.81 × 10⁻³ (95% confidence) and −0.52 < Δμ/μ < 0.17 (95% confidence) in an 
analysis where both α and μ could vary. The substantial increase in the error bar on Δμ/μ 
(compared to Δm_e/m_e) as a result of allowing m_p to vary is clearly seen. 

1-6.6 Other 

We note with amusement the April Fool's Day spoof article on arXiv claiming to detect 
a temporal variation in π through examination of historical calculated values (Scherrer, 
2009), and thank the author for a good laugh. This "result" was widely circulated on 
the Internet through popular science websites (e.g. New Scientist) and blogs. Some com- 
mentators did not seem to realise the nature of the paper. This demonstrates both that 
a fairly wide readership is interested in the variation of fundamental constants, and also 
that citing the results of papers without reading them can lead to much embarrassment 
for those involved. At the time of writing, NASA's ADS records no refereed citations to 
this article, and therefore we kindly supply Robert Scherrer with one through this work. 

1-7 Quasar absorption lines 

The discovery of quasars (Schmidt, 1963), first observed as star-like radio-loud^ objects, 
rapidly led to an intense study of the absorption spectra they generate. Schmidt (1963) 
observed that 3C 273 (which has an apparent magnitude of about 13, but an absolute 
magnitude of about −27) exhibited a redshift of 0.16, implying recession at ~ 47,000 
km/s. The mechanism through which quasars generate power for the observed luminosity 
and redshift was initially unknown. In particular, the light curves of quasars were 
initially observed to vary on the timescale of years, implying that the power source must be 
contained within parsec-sized regions (Greenstein & Schmidt, 1964). Known power 
generation mechanisms were insufficient to explain the observed luminosity unless the objects 
had lifetimes of < 10^ years (Greenstein & Schmidt, 1964). However, Hawkins (2010) 
recently claimed that quasar light curves do not show the expected time dilation, and 
therefore that intrinsic variability may be due to other factors, such as microlensing. The 
microlensing explanation seems difficult to support, as the required population of compact 
galactic halo objects is incompatible with the results from the MACHO project (Alcock 
et al., 1997; Hawkins, 2010). 

^Here, model-free means that no specific model for variation of α or m_e is considered. 
^It is now known that not all quasars are radio-loud. 

Resolution of the power source conundrum came with the finding in the 1970s that black 
hole accretion disks could generate sufficient amounts of power to match observed luminosi- 
ties (Shakura & Sunyaev, 1973). Although it was unknown originally how disk viscosities 
could be sufficiently high to generate the requisite angular momentum transfer, it is now 
clear that magnetohydrodynamical stresses are crucial (Blaes, 2007; Kuncic & Bicknell, 
2007). A fit to 60 observed quasar and active galactic nuclei (AGN) spectra indicated 
that the observed power law continuum is well modelled by a geometrically thin, optically 
thick black hole accretion disk (Sun & Malkan, 1989). For the purposes of our work, the 
mechanics of power generation are not relevant. Instead, we utilise the fact that quasars 
are the brightest continuous sources known in the universe. Their extreme luminosities 
allow observations at high redshifts, which can probe more than 90 percent of the time 
back to the Big Bang. 

Gunn & Peterson (1965) and Bahcall & Salpeter (1965) suggested that absorption along 
the line of sight to high redshift objects could be detected by optical observation of 
redshifted UV absorption lines caused by intergalactic H I. Lynds (1971) suggested that the 
"forest" of absorption lines almost exclusively blueward of the quasar Lyman-α emission 
line was due to Lyman-α absorption by intervening H I; this has since become known as 
the Lyman-α forest. Becker et al. (2001) claimed detection of a complete Gunn-Peterson 
trough, where zero flux is observed, in observations of a z = 6.28 quasar. See also 
Djorgovski et al. (2001); Fan et al. (2003, 2006). Murphy (2002, and references therein) noted 
that the lower column density forest lines probably arise from "the large-scale filamentary 
and sheet-like structures in which galaxies are embedded". He also noted that the higher 
column density forest lines probably arise from galaxy halos, or galaxies themselves. 

Quasar spectra also display metal-line absorption (e.g. Burbidge et al., 1966; Stockton & 
Lynds, 1966), which may be due to clouds either associated with the quasar host galaxy 
itself or at some other (cosmological) distance along the line of sight. Investigation of 
the metal absorption complexes at high resolving powers reveals dense and complicated 
velocity structures. Metals in this context refer to any element more massive than helium. 
We show in figure 1.1 a schematic representation of a quasar spectrum, and highlight the 
characteristics of metal absorption and Lyman-α absorption. Although the high redshifts 
of these absorbers imply they are at cosmological distances, it is reassuring that in many 
cases the galaxies with which the absorbers are associated can be identified through direct 
imaging (see for instance Zych et al., 2007). 




[Figure 1.1 image: schematic quasar spectrum; horizontal axis: observed wavelength, 4000 to 6000 Å.] 

Figure 1.1: Schematic overview of a quasar spectrum. The emission line marked "Lyα" at 
λ ~ 4950 Å is due to Lyman-α emission by the quasar. Bluewards of the Lyman-α emission 
peak is a dense series of absorption lines, the Lyman-α forest, caused by absorption 
by intervening H I along the line of sight to the quasar. Clouds with sufficiently high H I 
column density display damped wings, and are known as Damped Lyman-α absorbers if 
the H I column density is greater than 2 × 10²⁰ cm⁻². Other H I absorbers, Lyman 
limit systems (LLSs), still have sufficiently high column densities, of N(H I) > 2 × 10¹⁷ 
cm⁻², to cause a substantial drop in the transmitted quasar flux below the Lyman limit 
(at ~ 911.8 Å in the rest frame of the absorber). A LLS is indicated in this system by the 
"distant galaxy" and absorption at λ ~ 4250 Å. Metal lines are often observed redwards 
of the Lyman-α emission peak, indicated here by the narrow absorption lines corresponding 
to Ni II, Si II, C IV, Fe II, Al II and Al III (all in black text). These are due to metal line 
absorption along the line of sight to the quasar. Metal lines also fall in the Lyman-α forest, 
but are often observed out of the forest simply because some transitions possess rest 
wavelengths significantly longer than the 1216 Å Lyman-α line. These metal lines prove 
useful to search for a change in α (chapter 4). The redshifted transitions of molecular 
hydrogen, which can be used to search for a change in μ (chapter 3), all possess rest 
wavelengths shorter than 1216 Å, and therefore are observed only in the Lyman-α forest. 
All absorption and emission is observed with respect to an underlying power law spectrum, 
indicated by the dashed red line. Diagram by Michael Murphy, used with permission. 

1-7.1 Quasar absorption lines & fundamental constants 

As the absorption lines displayed in the spectra of quasars occur as a result of gas clouds 
at cosmological distances, they can be used as a sensitive probe of physics at the time 
of absorption of the light. Certain transitions are more sensitive to variation in one or 
more fundamental constants, and it is these transitions which have been actively targeted. 
A single transition cannot be used to search for a variation in fundamental constants, 
because the redshift of the absorbing gas cloud is unknown. However, the use of two or 
more transitions with a differing sensitivity to a change in the constant of interest can 
yield a constraint on the constants involved, as the redshift is then no longer degenerate 
with a variation in the constants considered. Metal transitions can be used to search for a 
change in α, whereas molecular transitions (and in particular, molecular hydrogen) can be 
used to search for a change in μ. Importantly, as will be seen in chapters 3 and 4, the way 
in which various transitions would vary if μ or α were different at the time of absorption 
is a distinctive fingerprint, which is difficult to confuse with, or be mimicked by, some 
other effect. 
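
The way a second transition breaks the degeneracy between redshift and a change in a constant can be sketched with a toy calculation. The snippet assumes the commonly used parameterisation λ_obs = λ_rest (1 + z)(1 + K Δμ/μ), where K is the sensitivity coefficient of a transition; this functional form, the rest wavelengths and the coefficients below are all illustrative assumptions and are not values used elsewhere in this work.

```python
def solve_redshift_and_dmu(lam_obs, lam_rest, k):
    """Solve for (z, dmu/mu) from two transitions, assuming
    lam_obs[i] = lam_rest[i] * (1 + z) * (1 + k[i] * dmu_over_mu)."""
    if k[0] == k[1]:
        raise ValueError("equal sensitivities leave z and dmu/mu degenerate")
    y1 = lam_obs[0] / lam_rest[0]
    y2 = lam_obs[1] / lam_rest[1]
    # From y1 * (1 + k2 * d) = y2 * (1 + k1 * d), solved for d = dmu/mu.
    dmu = (y1 - y2) / (k[0] * y2 - k[1] * y1)
    z = y1 / (1.0 + k[0] * dmu) - 1.0
    return z, dmu


# Synthetic round trip with made-up rest wavelengths and sensitivity coefficients.
z_true, dmu_true = 2.5, 1.0e-5
lam_rest = (1000.0, 1100.0)
k = (0.03, -0.01)
lam_obs = tuple(l * (1 + z_true) * (1 + ki * dmu_true)
                for l, ki in zip(lam_rest, k))

z_fit, dmu_fit = solve_redshift_and_dmu(lam_obs, lam_rest, k)
print(z_fit, dmu_fit)
```

With a single transition only the product (1 + z)(1 + K Δμ/μ) is measured, so any shift can be absorbed into the redshift. In practice many transitions are fitted simultaneously rather than solved pairwise as here.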

Common methods & methodology 



The results of chapters 3, 4, 5 and 7 share much in common: they all derive constraints 
on fundamental parameters through the application of Voigt profile fitting to quasar 
absorbers. Therefore, we outline here methods & methodology common to these chapters. 

2-1 General comments on Voigt profile fitting 
2-1.1 Voigt profiles and VPFIT 

To fit Voigt profiles to the quasar spectra, we have used the non-linear least squares 
Voigt profile fitting program vpfit^ (Webb, 1987), which was specifically designed for this 
purpose. A Voigt profile describes the observed profile of an absorption line where the line 
is broadened through both Doppler (Gaussian) and Lorentzian broadening mechanisms 
(Armstrong, 1967). In the case of quasar spectra, the former mechanism is due to 
a combination of turbulent motions of the gas and the non-zero gas temperature, whilst 
the latter is due to the finite lifetime of excited states. Each Voigt profile for a particular 
transition is described by three numbers: the redshift of the transition, z, the column 
density, N, and the velocity width, b (also known as the b-parameter). The column 
density is the number of atoms per cm², integrated along the line of sight. The 
b-parameter defines the observed width of the transition (where b = √2 σ), and is usually 
specified in km/s. 

^Available at http://www.ast.cam.ac.uk/~rfc/vpfit.html. 

VPFIT attempts to minimise χ², where 

$$\chi^2 = \sum_{i=1}^{N} \frac{\left[f(\mathbf{x})_i - y_i\right]^2}{\sigma_i^2}, \qquad (2.1)$$

f(x)_i is the model prediction for the ith flux pixel for a set of parameters x, y_i is the 
normalised flux of the ith pixel and σ_i is the 1σ statistical uncertainty associated with 
that flux pixel. The model consists of a series of Voigt profiles. The user must supply the 
number of profiles to fit, as well as reasonable starting guesses for each of them. Clearly, 
each transition must be appropriately identified, which requires identification of the ground 
state, the wavelength of the transition and the atomic mass of the species from which the 
transition originates. 
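
The goodness-of-fit statistic described above, a sum over pixels of squared residuals weighted by the flux uncertainties, can be sketched in a few lines. This is only an illustration of the statistic itself, not of how vpfit is implemented; the function name and the five-pixel example data are invented.

```python
def chi_squared(model, flux, sigma):
    """Sum over pixels of ((model_i - flux_i) / sigma_i)^2."""
    if not (len(model) == len(flux) == len(sigma)):
        raise ValueError("model, flux and sigma must have the same length")
    return sum(((f - y) / s) ** 2 for f, y, s in zip(model, flux, sigma))


# Invented five-pixel example: model prediction, normalised flux, 1-sigma errors.
model = [1.0, 0.8, 0.3, 0.8, 1.0]
flux = [1.02, 0.77, 0.33, 0.81, 0.98]
sigma = [0.02] * 5

print(chi_squared(model, flux, sigma))
```

In a real fit the model values would come from evaluating the Voigt profiles for the current parameter set, and the optimiser would adjust those parameters to reduce this sum.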

The optimisation proceeds iteratively until the fractional change in χ² is below some 
user-defined cutoff. One desires that the change in χ² should be much less than unity near 
the optimisation solution (Press et al., 1992). We have chosen this stopping criterion as 
Δχ² < 10~^, which fulfils this requirement even for many thousands of degrees of freedom. 

The optimisation algorithm used by vpfit is described in greater detail in section 
7-1.2, where we consider not only the mechanics of the algorithm but potential points of 
failure. Also of interest for determining whether the model is a good fit to the data is the 
normalised χ², or χ² per degree of freedom ν, defined as χ²_ν = χ²/ν (see below for more 
on model selection). The Voigt function has no closed-form expression, and therefore must 
be evaluated through numerical methods. A good review of different algorithms is given by 
Murphy (2002). 

As a result of the optimisation, vpfit provides parameter estimates on all free parameters, 
as well as statistical uncertainties, which are given by the square root of the diagonal terms 
of the covariance matrix at the purported solution, multiplied by √(χ²_ν) for the fit. The 
multiplication by √(χ²_ν) is a first-order correction to account for dispersion of the spectral 
data about the model which is greater or less than the expected χ²_ν ≈ 1 (Press et al., 
1992). 
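
The rescaling of the uncertainties described above can be illustrated as follows. The covariance matrix, χ² value and number of degrees of freedom are invented for the example, and the function is a sketch of the correction rather than code taken from vpfit.

```python
import math


def scaled_uncertainties(covariance, chi2, dof):
    """1-sigma uncertainties from the diagonal of the covariance matrix,
    multiplied by sqrt(chi2 / dof) as a first-order correction."""
    if dof <= 0:
        raise ValueError("number of degrees of freedom must be positive")
    scale = math.sqrt(chi2 / dof)
    return [math.sqrt(covariance[i][i]) * scale for i in range(len(covariance))]


# Invented two-parameter example with chi2 per degree of freedom equal to 2.
covariance = [[4.0, 0.5],
              [0.5, 9.0]]
errors = scaled_uncertainties(covariance, chi2=200.0, dof=100)
print(errors)
```

When χ²_ν exceeds unity the quoted uncertainties grow, and when it is below unity they shrink; only the diagonal terms enter, so correlations between parameters are not reflected in these numbers.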

VPFIT allows the user to link parameters which are physically related. In particular, 
this means that the redshifts of components can be tied together if they are assumed to 
originate from the same location. Additionally, the b-parameters of transitions can be 
related. The relationship imposed relates to the choice of broadening mechanism. One 
can impose turbulent broadening (b² = b²_turb), thermal broadening (b² = 2kT/M, where T 
is the temperature of the cloud and M is the atomic mass of the species in question) or a 
combination of the two effects (b² = b²_turb + b²_therm). If species of different atomic mass 
are fitted simultaneously, vpfit can explicitly decompose the b-parameter into turbulent 
and thermal contributions. However, in almost all cases the two contributions are highly 
degenerate, leading both to very large uncertainties on the individual contributions and 
poor performance of the optimisation algorithm. For our α fits, we work only with the 
turbulent and thermal limiting cases. 
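
The broadening relations above can be made concrete with a short calculation. The sketch assumes a hypothetical cloud temperature of 10⁴ K and a hypothetical turbulent width of 3 km/s, and uses standard atomic weights for Mg and Fe; none of these values is taken from the fits in this work.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053906660e-27  # atomic mass constant, kg


def thermal_b_kms(temperature_k, mass_amu):
    """Thermal b-parameter in km/s, from b^2 = 2kT/M."""
    return math.sqrt(2.0 * K_B * temperature_k / (mass_amu * AMU)) / 1.0e3


def total_b_kms(b_turb_kms, temperature_k, mass_amu):
    """Combined width in km/s, from b^2 = b_turb^2 + b_therm^2."""
    return math.hypot(b_turb_kms, thermal_b_kms(temperature_k, mass_amu))


# Hypothetical cloud: T = 1e4 K, turbulent width 3 km/s.
for name, mass in (("Mg", 24.305), ("Fe", 55.845)):
    print(name, thermal_b_kms(1.0e4, mass), total_b_kms(3.0, 1.0e4, mass))
```

In the thermal limit heavier species have narrower lines, whereas in the turbulent limit all species share one width. Once the turbulent term dominates, the total widths of different species become nearly equal, which is one way to see why the two contributions are hard to separate.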

2-1.2 Model selection 

In fitting the quasar spectra, the objective is to produce a model which provides a physi- 
cally realistic, statistically acceptable model of the observed absorption features. Almost 
all absorption features display departures from that expected for a single Voigt profile, 
thus necessitating the use of multiple Voigt profiles ("velocity components") to achieve a 
statistically acceptable fit. Unfortunately, there is no way of knowing a priori how many 
components are required to obtain a statistically acceptable fit. The process of mod- 
elling the observed structure amounts to adding components until a physically realistic, 
statistically acceptable fit is achieved. 

We have three criteria for a statistically acceptable fit: 

1. χ²_ν ~ 1. For a statistically acceptable fit, χ²_ν should be of order unity. This 
follows from the fact that the χ² distribution with ν degrees of freedom has mean ν. 
However this criterion is not the only one which must be used. Adding components 
until χ²_ν ≲ 1 only suggests that the dispersion of the data points about the model is 
what one would expect for a reasonable model. Murphy et al. (2008b) demonstrate 
through simulations that, at least for one synthetic spectrum considered, "underfit- 
ting" of spectra may lead to significant bias in estimated values of Aa/a, whereas 
"overfitting" does not seem to induce bias of the same magnitude. We are therefore 
particularly cautious about underfitting spectra. 

2. Best fit possible. Fitting components until $\chi^2_\nu \sim 1$ does not mean that the considered
model is the best one, only that it might be a reasonable one. $\chi^2$ fitting is a
maximum likelihood method, and under the maximum likelihood method one must
choose whichever model best explains the data. This means that if one can find a
model which reduces $\chi^2$ more than would be expected by chance, this model should
be preferred.

A rigorous way to proceed in this fashion is to perform a statistical significance
test on every component added (for example, the F-test). This process is not only
laborious, but does not allow the comparison of multiple models simultaneously. To
remedy this, certain heuristics are available which tend to lead to reasonable choices.
A primary method utilised by many practitioners is to try to find the model which
minimises $\chi^2_\nu$. If one adds a component, and $\chi^2_\nu$ increases, this suggests that the
extra component is not supported by the data. In model selection, parsimony is
valued: one should attempt to choose whichever model best explains the data, in
the simplest fashion.

Other methods are available which penalise free parameters more or less strongly.
We have chosen to use the Akaike Information Criterion (AIC) (Akaike, 1974), defined
as AIC $= \chi^2 + 2p$, where $p$ is the number of free parameters. When comparing
two models, whichever model has the lower AIC should be preferred. The AIC
is derived by approximately minimising the Kullback-Leibler entropy (Kullback &
Leibler, 1951), which measures the difference between the true distribution and the
model distribution. In fact, the AIC is only correct in the limit of large $N/p$ (where
$N$ is the number of data points fitted), which is generally not true for our fits. Thus,
we use the AIC corrected for finite sample sizes (Sugiura, 1978), defined as

$$\mathrm{AICC} = \chi^2 + 2p + \frac{2p(p+1)}{N - p - 1}. \qquad (2.2)$$

A significant advantage of the AICC is that it allows the comparison of multiple
models simultaneously, or two models which are not nested. If several competing
models are being considered, one chooses the model which has the lowest AICC.
The actual value of the AICC is not important; only relative differences matter.
The AICC is interpreted according to the Jeffreys' scale (Jeffreys, 1961; Liddle,
2007) where $\Delta$AICC > 5 is considered strong evidence and $\Delta$AICC > 10 is considered
very strong evidence (this corresponds to odds ratios of approximately 13:1 and
150:1 against the weaker model).

Another commonly used information criterion is the Bayesian Information Criterion
(BIC), introduced by Schwarz (1979), defined as

$$\mathrm{BIC} = \chi^2 + p \ln N. \qquad (2.3)$$

The BIC is obtained by approximating the Bayes factor (Jeffreys, 1961), which gives
the ratio of the posterior odds of one model compared to another. For $N \geq 8$
(i.e. in all practical circumstances), the BIC penalises free parameters more strongly
than the AICC. Liddle (2007) provides a good summary of the AIC, BIC and other
information criteria. Unfortunately, there is no easy decision as to which criterion
is better. Burnham & Anderson (2002) prefer the AIC, but note that the BIC is
justified whenever the complexity of the model does not increase with the size of the
data set. This is not true in the case of quasar absorption line fitting: although
one can increase the statistical precision of the data through longer observations,
a combination of seeing and the light collecting ability of the telescope limits the
practical resolving power. This means that the number of pixels which sample an
absorption feature of interest is limited. Moreover, although the density of fitted
components varies somewhat depending on the situation under consideration, in
general the model complexity scales roughly with the amount of spectral data fitted.
For these reasons, we use the AICC.

3. No long range correlation of residuals. When fitting, one must consider the degree
of correlation of the normalised (standardised) residuals, $r_i$, of the fit (where $r_i$ =
[data − model]/error). It is clearly possible to achieve $\chi^2_\nu \sim 1$ and yet have long
range correlations in the residuals (i.e. a situation where many pixels systematically
deviate from $r = 0$, over the range of a few to tens of pixels). Despite the fact that
$\chi^2_\nu \sim 1$, this indicates that the fit is unlikely to be adequate. An explicit calculation
of the chance probability can be made using the well-known Wald-Wolfowitz runs
test^ (Wald & Wolfowitz, 1940), although this is unnecessary in most cases. In
general, adding components which remove significant correlations of the residuals
also reduces the AICC, and we accept the addition of components which decrease
the AICC.
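The information criteria discussed in the second criterion can be evaluated directly from the $\chi^2$ of a fit. The sketch below is a minimal illustration only: the function names, the model labels and the numerical values are hypothetical and are not taken from any fit in this work.

```python
import math


def aicc(chi2, p, n):
    """AIC corrected for finite sample size: chi2 + 2p + 2p(p+1)/(n-p-1)."""
    if n - p - 1 <= 0:
        raise ValueError("AICC requires n > p + 1")
    return chi2 + 2.0 * p + 2.0 * p * (p + 1) / (n - p - 1)


def bic(chi2, p, n):
    """Bayesian Information Criterion: chi2 + p ln(n)."""
    return chi2 + p * math.log(n)


def rank_by_aicc(models, n):
    """models maps a label to (chi2, number of free parameters).

    Returns the preferred label and each model's AICC difference from it.
    """
    scores = {name: aicc(chi2, p, n) for name, (chi2, p) in models.items()}
    best = min(scores, key=scores.get)
    return best, {name: score - scores[best] for name, score in scores.items()}


# Hypothetical comparison of a 3- and a 4-component fit to 120 pixels.
best, delta = rank_by_aicc({"3 components": (130.0, 9),
                            "4 components": (112.0, 12)}, n=120)
```

With these made-up numbers the extra component lowers the AICC by more than 10, which on the scale quoted above would count as very strong evidence in its favour.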

Thus, in fitting the observed absorption profiles, we attempt to obtain a fit which has
$\chi^2_\nu \sim 1$, the minimum AICC possible and no substantial correlations of the residuals.
However, we treat with caution any fitted component which seems to improve the AICC
significantly but seems physically implausible. This is possible where unremoved spikes
exist in the data (for instance, as a result of uncleaned cosmic rays). When we fit metal
lines to search for $\Delta\alpha/\alpha$, the use of many transitions of differing optical depths allows
one, in most cases, to reliably fit narrow lines. However, problems can emerge when
fitting forest data along with molecular hydrogen data to investigate $\Delta\mu/\mu$. As will be
seen in chapter 3, we only use the H I $\lambda$1215.7 Å transition to fit the observed structure
in the forest. By using only a single transition to fit the forest data, it is possible to fit
uncleaned noise spikes. The fact that a noise feature has been fitted can generally be
determined a posteriori as the component required to fit the noise has a velocity width
much smaller than the instrumental resolution (generally $b < 1$ km s$^{-1}$). Additionally, the
errors on these parameters are very large (for the $b$-parameter, many times larger than
the value of $b$).

In assessing whether any particular region of the spectrum is adequate, there are two rough
considerations: i) are the magnitudes of the residuals too large or too small? (this relates
to the $\chi^2$ test); and ii) are there long range correlations in the residuals? (this relates to
the runs test). If the RMS of the normalised residuals is ~ 1 and there are no long range
correlations, the model is likely to be adequate (though not necessarily optimal).
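Both rough checks can be made quantitative. The following sketch computes the RMS of the normalised residuals and applies the Wald-Wolfowitz runs test to their signs, using the standard large-sample normal approximation for the number of runs. The function names are illustrative assumptions, and the approximation is only reasonable when there are many pixels of each sign.

```python
import math


def rms(residuals):
    """Root-mean-square of the normalised residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def runs_test(residuals):
    """Wald-Wolfowitz runs test on the signs of the residuals.

    Returns (number of runs, z-score, two-sided p-value) under the normal
    approximation. A strongly negative z means too few runs, i.e. long
    range correlations in the residuals.
    """
    signs = [r > 0 for r in residuals if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need residuals of both signs")
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    mean = 2.0 * n_pos * n_neg / n + 1.0
    var = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n * n * (n - 1.0))
    if var <= 0:
        raise ValueError("too few points for the normal approximation")
    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, z, p_value


# Artificial example: RMS is exactly 1, yet the residuals are clearly correlated.
correlated = [1.0] * 10 + [-1.0] * 10
runs, z, p_value = runs_test(correlated)
```

The artificial example shows why the RMS alone is insufficient: it equals 1, but only two runs are present and the runs test rejects randomness.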

Unfortunately, there is a degree of subjectivity to Voigt profile fitting, especially in equiv- 
ocal cases where the signal-to-noise ratio is low and/or the line widths are close to the 
instrumental resolution. This is difficult to avoid simply because the Voigt profile decom- 
position is not unique. 

In the case of H2 (for $\Delta\mu/\mu$), given the large number of transitions used to determine
the H2 structure, the quantity of data is sufficiently high that it is extremely unlikely
one can subjectively bias $\Delta\mu/\mu$ through choice of the Voigt profile model. In the case of
$\Delta\alpha/\alpha$, one may theoretically be able to introduce some bias into the value of $\Delta\alpha/\alpha$ for a
particular absorber, although we regard this as extremely difficult to do in practice. The
response of $\Delta\alpha/\alpha$ to the addition of components is not obvious except in the simplest
of cases, and therefore any attempt to systematically bias $\Delta\alpha/\alpha$ would not only require
detailed calculations in each case, but would probably be unable to be supported by the
data in any event. The consequence of this, and the fact that the absorption profiles differ
from absorber to absorber, means that any error introduced through a failure to select
the correct model will be random from absorber to absorber, and therefore will average
out when considering the results of a statistical ensemble of absorbers. The only way to
significantly bias $\Delta\alpha/\alpha$ over an ensemble of absorbers through the model selection process
is by using the numerical value of $\Delta\alpha/\alpha$ to inform the model selection process, which is
clearly a very dangerous way to proceed. We do not use the value of $\Delta\alpha/\alpha$ to guide our choice
of model, and therefore no bias should be introduced as a result of our model selection
methodology.

^This is often known as just the "runs test".

2-2 Data pipeline problems 

The MIDAS extraction routine (part of the UVES pipeline) appears to incorrectly estimate
the errors associated with the flux data points in the base of saturated lines. In particular,
the dispersion of the flux data points is too large to be accounted for by the statistical
error. Fitting a straight line through the base of saturated lines typically produces $\chi^2_\nu \sim 2$.
The problem is somewhat more noticeable in the blue end of the spectra. Although it is
difficult to determine precisely what happens in regions of low, but non-zero flux, we
believe that the errors there are also underestimated. The effect of this is to give falsely
high precision on any quantity derived from these data points (including $\Delta\mu/\mu$ or $\Delta\alpha/\alpha$).
Additionally, one cannot fit plausible models to data involving regions of low or negligible
flux; to achieve a reasonable $\chi^2_\nu$ in these regions one would need to fit very large numbers
of unphysical components.

When fitting the H2 spectra initially we adopted one approach to adjusting for this problem
(section 2-2.1 below), but for our second analysis of Q0528-250 in section 3-6 and for
$\Delta\alpha/\alpha$ we adopted a more automatic approach (section 2-2.2 below).

2-2.1 Correcting error arrays through an approximate functional form 

One way to attempt to correct the problem in the base of saturated lines is to try to
approximate the functional form of the problem. The errors in the continuum are acceptable,
whereas those in the base of lines are not, so presumably there is some monotonically
increasing function from a normalised flux of 1 to 0 which describes this behaviour. If one
knew the functional form, one could increase the error estimates, thereby removing the
problem. Investigation of the problem suggests (R. F. Carswell, priv. communication)
that the functional form

$$e_i \rightarrow e_i \left\{ a + b \left[ 1 - \max\left( 0, \min\left( 1, \frac{d_i}{c_i} \right) \right) \right]^{s} \right\}^{1/t} \qquad (2.4)$$

might be useful in correcting the problem, where $e_i$ is the error on the $i$th normalised flux
pixel, $d_i$ is the normalised flux value at that point, $c_i$ is the value of the continuum at that
point and $a$, $b$, $s$ and $t$ are user-defined constants to emulate the desired behaviour. R. F.
Carswell suggested using $s = t = 4$. Clearly this form will be incorrect, but in the absence
of any other information a guess of this sort is all that is possible. One then chooses $a = 1$
to leave errors in the continuum unchanged and $b$ such that $(a + b)^{1/t} = f$, where $f$ is the
factor by which errors should be increased in the base of saturated lines.
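A minimal sketch of this kind of correction is given below. It assumes the multiplicative factor is $\{a + b[1 - \max(0, \min(1, d_i/c_i))]^s\}^{1/t}$; the placement of the exponent $s$ on the bracketed term is an assumption, as are the function name and the illustrative choice $f = 2$. The constant $b$ follows from $(a + b)^{1/t} = f$.

```python
def error_scale_factor(flux, continuum, f=2.0, s=4.0, t=4.0, a=1.0):
    """Factor by which to multiply the error on one normalised flux pixel.

    Assumed form: {a + b * [1 - clip(flux/continuum, 0, 1)]**s}**(1/t),
    with b fixed by requiring (a + b)**(1/t) = f at zero flux.
    """
    b = f ** t - a
    depth = 1.0 - max(0.0, min(1.0, flux / continuum))
    return (a + b * depth ** s) ** (1.0 / t)


# Illustrative use: inflate errors across a line profile (values are made up).
fluxes = [1.0, 0.8, 0.3, 0.0, 0.0, 0.4, 1.0]
errors = [0.02] * len(fluxes)
corrected = [e * error_scale_factor(d, 1.0) for d, e in zip(fluxes, errors)]
```

With $a = 1$ the continuum errors are untouched, and the factor rises smoothly to $f$ in the base of a saturated line.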

2-2.2 Correcting error arrays through consistency checks with the input 
spectra 

Another method of correcting this problem, and other problems arising from inconsistencies
between combined spectra, is by adjusting the error arrays to account for the degree
of inconsistency of the spectral combination. When individual exposures are co-added to
create a combined spectrum, UVES_popler provides a check on the
concordance of the different spectra, by calculating a value of $\chi^2_\nu$ for each flux pixel in
the combined spectrum by considering the dispersion of the corresponding pixels in the
contributing spectra about their weighted mean. For each spectral data point in the combined
spectrum, we take a region of five pixels centred on that point, and take the median
of the $\chi^2_\nu$ values associated with those five points. We then multiply the error estimate for
that spectral point by the square root of that median value (that is,
$\sigma_i \rightarrow \sigma_i \times \sqrt{\mathrm{med}[\chi^2_\nu]}$).
This is a first-order correction to the error estimate to ensure that the individual exposures
are consistent with the weighted mean (Press et al., 1992). Thus, this algorithm provides
protection against under-estimation of the errors in the base of saturated lines.
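The correction described above can be sketched in a few lines of Python. This is an illustration rather than the code actually used: the function name is hypothetical, and the handling of the first and last two pixels (where the five-pixel window is truncated) is an assumption, since the text does not specify it.

```python
from statistics import median


def rescale_errors(errors, chi2_nu, half_width=2):
    """Multiply each error by sqrt(median of chi2_nu over a 5-pixel window).

    errors  : 1-sigma error for each pixel of the combined spectrum
    chi2_nu : per-pixel reduced chi-squared of the contributing exposures
              about their weighted mean
    The window is truncated at the ends of the array (an assumption).
    """
    if len(errors) != len(chi2_nu):
        raise ValueError("errors and chi2_nu must have the same length")
    n = len(errors)
    rescaled = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        rescaled.append(errors[i] * median(chi2_nu[lo:hi]) ** 0.5)
    return rescaled


# Made-up example: a run of discordant pixels versus a single discordant pixel.
broad = rescale_errors([1.0] * 7, [1.0, 1.0, 4.0, 4.0, 4.0, 1.0, 1.0])
spike = rescale_errors([1.0] * 7, [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0])
```

Using the median rather than the mean makes the correction insensitive to a single discordant pixel, while a run of discordant pixels still inflates the errors.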

Additionally, this algorithm also provides some protection against other data combination
problems (such as weak sky emission that differs between exposures or improperly removed
cosmic rays). However, some of these effects have non-zero expectation value (that is, they
cannot be averaged out with large numbers of exposures), and so data affected by these
processes should not be utilised. In particular, cosmic rays always contribute excess flux,
and therefore the impact of including data affected by these cosmic rays would be lessened
by our algorithm, but the results would still be biased. In the case of $\Delta\mu/\mu$ and $\Delta\alpha/\alpha$,
although this effect is random from transition to transition and absorber to absorber (and
therefore cannot systematically bias $\Delta\mu/\mu$ or $\Delta\alpha/\alpha$ over a larger number of systems), it
is an extra source of uncertainty, which would make our final error estimates larger than
might otherwise be needed.
Chapter 3

$\mu$: the proton-to-electron mass ratio

3-1 Introduction 

The proton-to-electron mass ratio, $\mu$, is defined simply as the proton mass divided by the
electron mass, i.e. $\mu = m_p/m_e$. Some works define $\mu = m_e/m_p$, and therefore caution is
warranted in reading the literature. The current 2006 CODATA recommended value is
$\mu = 1836.152\,672\,47(80)$ (Mohr et al., 2008), derived from two experiments using Penning
ion traps.

3-1.1 The importance of $\Delta\mu/\mu$

In the Standard Model, the proton mass is proportional to $\Lambda_{\rm QCD}$, where $\Lambda_{\rm QCD}$ is the
value of the Landau pole in the logarithm of the running strong coupling constant, i.e.
$\alpha_s \sim 1/\ln(\Lambda_{\rm QCD}\, r/\hbar c)$, if the direct ~ 10% contribution from the quark masses is ignored
(Berengut et al., 2010b). The electron mass, $m_e$, is proportional to the Higgs vacuum
expectation value (vev), $v_H$, if one assumes the Higgs mass mechanism (Coc et al., 2007).
The Higgs vev determines the electroweak unification scale. Therefore $\mu = m_p/m_e$ depends
on the ratio $\Lambda_{\rm QCD}/v_H$. As a result, $\Delta\mu/\mu$ probes evolution in the strong force relative
to the electroweak scale. This contrasts with the fine-structure constant, $\alpha$, which probes
the strength of the electromagnetic force.
3-2 Quasar constraints

Almost all direct^ quasar constraints on $\Delta\mu/\mu$ rely on the examination of molecular
hydrogen transitions, although the inversion transitions of ammonia now provide a strong
test at moderate (z < 1) redshifts.

3-2.1 Molecular hydrogen (H2) 

Most known baryonic matter in the universe is hydrogen, found in either atomic or molec- 
ular form (Combes & Pineau des Forets, 2000). Molecular hydrogen transitions fall in the 
far ultraviolet, and therefore cannot be observed from the ground due to the UV cutoff 
caused by atmospheric ozone. The first astrophysical observation of molecular hydrogen 
was made in 1970, using a rocket-launched spectrometer, in the spectrum of the star $\xi$
Persei (Carruthers, 1970). The column density ratio of H2 to atomic hydrogen was found
to be approximately 1:3. The Far Ultraviolet Spectroscopic Explorer (FUSE) satellite
(Moos et al., 2000) made large numbers of observations of molecular hydrogen routine
(see for example Shull et al., 2000; Rachford et al., 2002; Tumlinson et al., 2002; Richter
et al., 2003; Rachford et al., 2009). The FUSE mission was concluded in 2007 after fine 
control over telescope pointing was lost. 

The possibility of observing redshifted molecular hydrogen transitions from the ground 
has been known for quite some time. Carlson (1974) ascribed features in the spectrum of 
quasar 4C 05.34 to molecular hydrogen, at a redshift of z = 2.64. Aaronson et al. (1974) 
conducted a search for molecular hydrogen in quasar spectra, and tentatively identified 
molecular hydrogen at z = 2.31 in the spectrum of PHL 957. Levshakov & Varshalovich
(1985) tentatively identified molecular hydrogen at z = 2.811 toward Q0528-250. This
identification was correct, and Q0528-250 forms part of the analysis of this chapter.

3-2.1.1 The sensitivity of molecular hydrogen to a change in $\mu$

Thompson (1975) noted that molecular absorption by gas clouds at high redshift along the
line of sight to quasar sources might reveal variation in $\mu$ over time, and identified molecular
hydrogen (H2) as a possible tool. Unfortunately, serious examination of this idea had to
wait some time for the robust detection of H2 at high redshift. Due to the UV atmospheric
cutoff, one needs to identify H2 absorbers at z > 2 in order to obtain a sufficient number
of lines in the optical region to make ground-based observations practically useful. Indeed,
only about a dozen absorbers are presently known which contain the requisite redshifted
H2 lines in their spectrum, and only a few of these have yielded strong constraints on
$\Delta\mu/\mu$. The reason that the number of absorbers known is small relates to the way in
which H2 is produced. H2 is formed in cold clouds, typically via adhesion onto dust grains
(Ge & Bechtold, 1999). The low temperature of the clouds means that the clouds must
be small, and so the chance of obtaining intersections with the line-of-sight to the quasar
is much smaller than for DLAs.

^Constraints on $\Delta\mu/\mu$ may be obtained through other dimensionless ratios, which are a combination of
fundamental constants, typically including $\Delta\alpha/\alpha$ and $\Delta g_p/g_p$, where $g_p$ is the proton gyromagnetic ratio.
However, determination of $\Delta\mu/\mu$ then requires disentangling these combinations of constants. We discuss
these combinations of fundamental constants in section 6-3.2.

Foltz et al. (1988) used the fact that the vibrational component of the energy of a transition
increases with increasing excited state vibrational quantum number to obtain
$|\Delta\mu/\mu| < 2 \times 10^{-4}$ from the H2 transitions in the z = 2.811 absorber toward Q0528-250, using a
spectrum with resolving power R ~ 5,000. However, Varshalovich & Levshakov (1993)
noted that different ro-vibrational transitions have a different dependence on the reduced
mass of the particular molecule. This led to the currently used definition of the sensitivity
coefficients, presented below.

For a molecular spectrum, there are three primary contributions to the observed structure,
all of which scale with the Rydberg energy, but only two of which depend on $\mu$ (Thompson,
1975). Firstly, the electronic energy has no dependence on $\mu$. The vibrational energy
structure scales as $E_{\rm vib} \propto \mu^{-1/2}$, similar to a harmonic oscillator. The rotational term
scales as $E_{\rm rot} \propto \mu^{-1}$, similar to a simple rotor. As such, the energy of a particular level of
the H2 molecule is given by

$$E = c_{\rm elec} + c_{\rm vib}\,\mu^{-1/2} + c_{\rm rot}\,\mu^{-1} \qquad (3.1)$$

for certain constants $c$ in the Born-Oppenheimer approximation (BOA) (Salumbides,
2009). One can derive the sensitivity to a change in $\mu$ for differing transitions using
either ab initio methods, or within a semi-empirical approximation, yielding

$$\lambda_i = \lambda_i^0 \left[ 1 + K_i \frac{\Delta\mu}{\mu} + \mathcal{O}\!\left( \left[ \frac{\Delta\mu}{\mu} \right]^2 \right) \right] \qquad (3.2)$$

where $\lambda_i$ is the wavelength of a transition under consideration, $\lambda_i^0$ is the unperturbed value
and $K_i$ is a sensitivity coefficient which determines the magnitude and sign of the effect.
The values of $K_i$ are defined in terms of the derivatives of the energy or wavelength of the
transition with respect to $\mu$, as (Reinhold et al., 2006; Salumbides, 2009)

$$K_i = \frac{d \ln \lambda_i}{d \ln \mu} = -\frac{\mu}{E_e - E_g} \left( \frac{dE_e}{d\mu} - \frac{dE_g}{d\mu} \right) \qquad (3.3)$$

where $E_e$ and $E_g$ are the energies of the excited and ground states respectively. For
useful H2 transitions $K_i$ is typically in the range $-0.02 \lesssim K_i \lesssim 0.05$. Early observations
established that $|\Delta\mu/\mu| \ll 1$, thereby allowing the use of only the first order term of
equation 3.2 with good accuracy.
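To give a feel for the size of the effect, the sketch below applies the first-order relation $\lambda_i = \lambda_i^0 (1 + K_i\,\Delta\mu/\mu)$ and converts the shift into a velocity. The function names and the numerical inputs are illustrative assumptions; the wavelength used is not a real H2 line.

```python
C_KMS = 299792.458  # speed of light, km/s


def shifted_wavelength(lambda0, k_i, dmu_over_mu):
    """First-order rest wavelength under a fractional change in mu."""
    return lambda0 * (1.0 + k_i * dmu_over_mu)


def velocity_shift_kms(k_i, dmu_over_mu):
    """Apparent velocity shift of a line, c * K_i * (delta mu / mu), in km/s."""
    return C_KMS * k_i * dmu_over_mu


# Illustrative numbers: K_i at the two ends of the typical range, delta mu/mu = 1e-5.
dv_high = velocity_shift_kms(0.05, 1.0e-5)
dv_low = velocity_shift_kms(-0.02, 1.0e-5)
differential = dv_high - dv_low
```

For these inputs the differential shift between the most and least sensitive lines is about 0.2 km/s, which is why the wavelength calibration and line centroids must be known to a small fraction of a pixel.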
We use the $K_i$ values from Ubachs et al. (2007), who use a semi-empirical treatment
based on the Dunham expansion of the energy levels of the H2 molecule. They include an
adiabatic correction to account for the contribution to the electronic energy of each state
from the nuclear mass, which scales as ~ $\mu^{-1}$. They also account for post-BOA effects by
including the term in the Hamiltonian which relates to the interaction between the
nuclear and electronic motion, and give careful attention to the effect this has on H2 level
crossings. We show in figure 3.2 the values of $K_i$ for a variety of Lyman and Werner series
H2 transitions.

Meshkov et al. (2006) have derived $K_i$ values based on ab initio calculations of the H2
molecule. Ubachs et al. (2007) compare their values of $K_i$ to those of Meshkov et al., and
note that the deviations, $\Delta K_i$, lie between $-2 \times 10^{-4}$ and $4 \times 10^{-4}$. Given the totally
independent method of derivation, this implies that the absolute accuracy of the $K_i$ values
is better than $5 \times 10^{-4}$.

Each H2 transition is described by quantum numbers $v$ and $J$, which describe the excited
state vibrational quantum number and the angular momentum of the ground state. The
Lyman series is described by the $X\,^1\Sigma_g^+ \rightarrow B\,^1\Sigma_u^+$ transitions, and the Werner series by
the $X\,^1\Sigma_g^+ \rightarrow C\,^1\Pi_u$ transitions. An additional letter, P, Q or R, denotes the quantity
$\Delta J = J' - J$ as $-1$, $0$ and $1$ respectively (where $J'$ is the angular momentum of the
excited state). A useful shorthand notation to describe particular lines is therefore AvBJ,
where A is L or W for Lyman or Werner, B is either P, Q or R and v and J are as
described previously. In figure 3.1 we show a schematic representation of the Lyman and
Werner series in the H2 molecule. The Lyman state is a $\Sigma$ state and therefore has a total
orbital angular momentum of zero, and the Werner state is a $\Pi$ state and therefore has a
total orbital angular momentum of unity. The selection rules thus impose the following
constraints on transitions: i) for the Lyman series, $\Delta J = \pm 1$, leading to P and R branches
but no Q branch; ii) for the Lyman series, transitions from $J = 0$ to $J' = -1$ are not
possible, and so there are no LvP0 transitions; and iii) for the Werner series, there are no
$J' = 0$ levels, so the lowest transition in the P branch is WvP2. We show in figure 3.3 how
the rest-frame wavelengths of a selection of H2 transitions would differ under variation of
$\mu$.

The accuracy of the laboratory data for the H2 transitions historically meant that the 
laboratory errors were non-negligible. Significant recent work has rectified this situation, 
such that the error budget is now wholly dominated by non-laboratory factors. The 
current best wavelengths are given in Bailly et al. (2009) and Ubachs et al. (2007), and 
have been collated in Malec et al. (2010) with Ki values, oscillator strengths and damping 
coefficients. 
Figure 3.1: Schematic representation of the energy levels of the H2 molecule. The left
side shows the Lyman series, whilst the right side shows the Werner series. The arrows on
the left side show transitions from the $J = 0$ level of the ground state to various levels
in the Lyman band. The arrows on the right side show transitions from various J-levels
of the ground state to the $v = 1$ vibrational level of the Werner band. Each vibrational
($v$) level is subdivided into states with different angular momentum (not shown).

3-2.1.2 General comments on measuring $\Delta\mu/\mu$ with H2

It is widely acknowledged that because the H2 transitions fall in the Lyman-$\alpha$ forest it is
difficult to model the spectra. Traditionally, researchers have discarded transitions which
appear to be heavily blended with the forest, and utilised only weakly blended transitions.
An obvious question is: how does one decide what is weakly blended? Clearly proceeding
in this fashion introduces an element of subjectivity into the analysis. A more appropriate
way to proceed is to model the forest explicitly, thereby allowing the uncertainty in
determining the forest structure to propagate into the uncertainty in determining $\Delta\mu/\mu$.

In order to account for the effect of the forest (which provides a background continuum
against which the H2 absorption occurs), in previous analyses researchers have generally
fitted a low order polynomial across the H2 transitions to estimate the optical depth due to
the forest. One can then divide the flux spectrum by this polynomial continuum estimate
to obtain an H2 profile to fit. This is particularly obvious in Ivanchik et al. (2005), where
many H2 profiles are displayed after division by the polynomial, which masks the presence
of the forest.

There are two problems with this method: 
Figure 3.2: The $K_i$ values of certain Lyman and Werner series H2 transitions. Lyman
series transitions are plotted as squares, whilst Werner series transitions are plotted as
triangles. Note that there is a reasonable correlation of wavelength with $K_i$ when considering
the Lyman series or Werner series individually. The importance of this is described
in section 3-2.1.4.

1. Firstly, it appears in the literature that the uncertainty in accounting for the local
continuum does not propagate into the error in determining $\Delta\mu/\mu$. As the forest
structure is unknown, this uncertainty should be accounted for. Any method which
does not attempt to account for the uncertainty in determining the forest structure
must under-estimate the required uncertainty on $\Delta\mu/\mu$. Unless one models the
forest structure appropriately, one cannot tell by how much the uncertainty is
under-estimated.

2. Where it is clear that absorption is due to other gas clouds^, one is not making use of 
the physics that generates the absorption. That is, one should model the absorption 
with a series of Voigt profiles in order to obtain a realistic model. A polynomial 
continuum across the observed H2 profile is not constrained to any physical situa- 
tion, and therefore in principle the estimated local continuum for the H2 transitions 
will be incorrect. Conversely, it must be noted that it is often difficult to differen- 
tiate several closely-spaced forest absorption features from an error in determining 
the local continuum, and therefore there is necessarily some error introduced by an 
incorrect model. 



^Usually Lyman-$\alpha$, although metal lines are also found in the forest.
Figure 3.3: The exaggerated effect of $\Delta\mu/\mu$ on certain H2 transitions assuming that
$\lambda_i = \lambda_i^0 (1 + K_i\,\Delta\mu/\mu)$. Although it is clear that $|\Delta\mu/\mu| \ll 1$ over most of observable time,
this plot shows the effect of $\Delta\mu/\mu \sim 1$. The interpretation of the label for each transition
is given in the text of section 3-2.1.1. Note the presence of certain transitions which are
relatively insensitive to $\mu$ variation (e.g. L2R0), as well as transitions which shift both
to longer (W4P4, L12R0, L5R3) and shorter (W0P3, L0P3) wavelengths. Crucially, this
"fingerprint" is rather unique, which therefore makes the measurement resistant to a wide
range of systematic effects.

Over a large number of molecular hydrogen transitions, one does not expect that
the use of a polynomial continuum to estimate the optical depth of the forest in the
vicinity of the H2 transitions will introduce a significant error into the determination
of $\Delta\mu/\mu$; the random nature of the forest structure with respect to the H2 line
profiles means that errors which bias $\Delta\mu/\mu$ to more positive values should occur as
often as those which bias $\Delta\mu/\mu$ to negative values. However, without accurately
modelling the forest, one cannot tell whether this argument is legitimate, or
what error is introduced by proceeding in a more simplistic fashion.

It is for these reasons that we have modelled the Lyman-$\alpha$ forest concurrently with the
H2 transitions.

We show example H2 transitions from Q0405-443 in figure 3.4. This figure clearly shows
the complexity of the forest, and the necessity of modelling the forest simultaneously with
the H2 transitions if one wants to ensure that the H2 line positions are accurately
determined.
Another concern for the measurement of $\Delta\mu/\mu$ using molecular hydrogen is simply the
paucity of known sources. In table 3.1, we give a list of currently known H2 sources at
sufficiently high redshift that investigation of $\Delta\mu/\mu$ is potentially feasible, with various
references which may be of interest to the reader. The relative lack of H2 sources is
problematic because one cannot then use the consistency of many results to check whether the
uncertainties in individual measurements are correct. If one has many $\Delta\mu/\mu$ results, one
can use the $\chi^2$ test (under some model) to determine whether the results are statistically
consistent. Inconsistency between the results is indicative of either the wrong model or
under-estimated uncertainties for the individual measurements. More importantly,
however, if $\mu$ varies with time and space then many different measurements of $\Delta\mu/\mu$
in different times and places are needed to map out the evolution of $\mu$.
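As a simple illustration of such a consistency check, the sketch below assumes the simplest model (a single constant value), computes the inverse-variance weighted mean of a set of measurements, and evaluates $\chi^2$ about that mean. The function names are hypothetical and the input numbers are placeholders, not real $\Delta\mu/\mu$ measurements.

```python
def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, total ** -0.5


def chi2_about_mean(values, sigmas):
    """chi^2 of the measurements about their weighted mean, with nu = N - 1."""
    mean, _ = weighted_mean(values, sigmas)
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    return chi2, len(values) - 1


# Placeholder inputs (arbitrary units), chosen only to make the arithmetic obvious.
mean, err = weighted_mean([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
chi2, nu = chi2_about_mean([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

A value of $\chi^2/\nu$ well above unity would point to either an inadequate model (for example, real variation) or under-estimated individual uncertainties; with only a handful of absorbers the test has little power.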

3-2.1.3 How to measure $\Delta\mu/\mu$

For a gas cloud at redshift z, one can relate the observed wavelength to the laboratory 
wavelength as 



The redshift of the cloud must determined simultaneously with Afi/fi. Afx/fi is not de- 
generate with redshift provided that at least two transitions of differing Ki are used. In 
principle, accurate knowledge of the observed wavelengths of the different II2 transitions 
are all that is needed to determine Afi/ /i. However, there are different approaches one can 
take to arrive at a value of A/i//i. There have been two methods used in the literature in 
recent times. These are the reduced redshift method (RRM) and the direct minimisation 
method (DCMM), described below. 

Reduced redshift method (RRM) (see Ivanchik et al., 2002; Reinhold et al., 2006). The 
RRM defines for each transition the quantity 

= ^^ = - = ^.^, (3.5) 
1 -I- ^0 c /i 

where Zj is the observed redshift of the transition and zq is the redshift of a transition for 
which Ki = 0. This quantity is just the velocity difference from the unperturbed value. 
The individual Zi values can be obtained by independent Voigt profile fits to each molecular 
hydrogen transition. From this relationship, a graph of Q vs Ki will thus have gradient 
Afi/fi. Afi/fi can then be determined through standard minimisation of a straight 
line. This method is advantageous in that one obtains a visual relationship between Q 
and Ki — this allows one to check whether outliers exist, facilitating either their removal 
or re-examination of the fit to investigate the reason for the discrepancy. 

Unfortunately, this method is not easily applied in the situation where the H2 absorption
displays more than one velocity component (where the components overlap). In this case,
one can generate the ζ_i values; however, the values for the components of each transition
will be correlated. This makes it difficult to analyse a graph of ζ_i vs K_i with standard
χ² minimisation, as χ² minimisation assumes that all data points are independent. In
principle, one can use Generalised Least Squares, which allows for correlated errors,
to analyse this situation, but this has not been applied in the literature. The RRM is
also a summary method, assuming that a table of redshifts and associated uncertainties
contains all the information needed. Although this makes calculation easy, it does not
operate directly on the spectral data; ideally one would prefer to work directly with the
spectral data rather than intermediate quantities.
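
For concreteness, the Generalised Least Squares estimator can be sketched as follows. This is illustrative only: `gls_line_fit` and the toy covariance matrix are assumptions, and in a real application the covariance between the reduced redshifts of overlapping components would have to be taken from the Voigt profile fit itself.

```python
import numpy as np

def gls_line_fit(K, zeta, cov):
    """Generalised least-squares fit of zeta = a + slope * K.

    cov is the full covariance matrix of the zeta values (off-diagonal
    terms encode correlations between components). Returns (beta, param_cov)
    with beta = [intercept, slope].
    """
    K = np.asarray(K, dtype=float)
    zeta = np.asarray(zeta, dtype=float)
    A = np.column_stack([np.ones_like(K), K])
    cov_inv = np.linalg.inv(np.asarray(cov, dtype=float))
    param_cov = np.linalg.inv(A.T @ cov_inv @ A)
    beta = param_cov @ (A.T @ cov_inv @ zeta)
    return beta, param_cov

# Toy case: four points, with points (0,1) and (2,3) correlated at rho = 0.5.
K = np.array([0.00, 0.01, 0.02, 0.03])
var = 1.0e-12
cov = var * np.eye(4)
cov[0, 1] = cov[1, 0] = cov[2, 3] = cov[3, 2] = 0.5 * var
beta, param_cov = gls_line_fit(K, 1.0e-7 + 2.0e-5 * K, cov)
print("slope =", beta[1], "+/-", np.sqrt(param_cov[1, 1]))
```

With a diagonal covariance matrix this reduces to the ordinary weighted fit.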

Direct χ² minimisation method (DCMM) (King et al., 2008; Malec et al., 2010). In the
DCMM, one assumes that all transitions arise from the same cloud and therefore the same
redshift. In the case of multiple components, one assumes that corresponding components
in each transition arise from the same redshift. One then perturbs the rest wavelengths
as λ_i → λ_i [1 + K_i(Δμ/μ)], and finds the value of Δμ/μ which minimises χ²; this is
the best-fitting value of Δμ/μ. One can model Δμ/μ as an external parameter, in which
case one plots χ² vs Δμ/μ. This graph will be approximately parabolic near the minimum,
with the location of the minimum giving the best-fit value of Δμ/μ. In this case, errors
can be obtained by finding σ_{Δμ/μ} such that

\chi^2\left( \left.\frac{\Delta\mu}{\mu}\right|_{\rm best\,fit} + \sigma_{\Delta\mu/\mu} \right) - \chi^2\left( \left.\frac{\Delta\mu}{\mu}\right|_{\rm best\,fit} \right) = 1    (3.6)

(Press et al., 1992). Alternatively, Δμ/μ can be included as a free parameter in the fit.
The inclusion of Δμ/μ as a free parameter in the fit has the advantage of being significantly
faster, as for any value of Δμ/μ the first and second derivatives of χ² with respect to Δμ/μ
are used to search for the minimum value of χ² (Murphy, 2002). Moreover, this method
significantly reduces the number of free parameters by imposing the physical constraint
that the transitions should arise from the same location, and therefore redshift. The
reduction in the number of free parameters should improve reliability as well as allowing
tighter confidence limits on Δμ/μ. The corollary of this is that one loses any explicit check
on whether an individual transition is consistent with the overall trend (i.e. whether the
reduced redshift differs greatly from the trend of ζ_i with K_i given by the other transitions).
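
When Δμ/μ is treated as an external parameter, the Δχ² = 1 criterion can be applied by fitting a parabola to the χ² values near the minimum. The sketch below assumes the χ² curve really is close to parabolic over the sampled range; the function name `delta_chi2_interval` and the synthetic χ² grid are hypothetical.

```python
import numpy as np

def delta_chi2_interval(dmu, chi2):
    """Fit chi2(dmu) with a parabola and return (best_fit, one_sigma).

    one_sigma is the offset from the minimum at which chi2 rises by 1.
    The abscissa is rescaled before fitting to keep the fit well conditioned.
    """
    dmu = np.asarray(dmu, dtype=float)
    chi2 = np.asarray(chi2, dtype=float)
    centre = dmu.mean()
    scale = dmu.max() - dmu.min()
    u = (dmu - centre) / scale
    c2, c1, _ = np.polyfit(u, chi2, 2)          # chi2 = c2*u^2 + c1*u + c0
    best = centre - scale * c1 / (2.0 * c2)     # vertex of the parabola
    sigma = scale / np.sqrt(c2)                 # c2 * (delta u)^2 = 1
    return best, sigma

# Synthetic chi2 curve with minimum at 3e-6 and 1-sigma width 6e-6.
grid = np.linspace(-2.0e-5, 2.0e-5, 41)
chi2_grid = 250.0 + ((grid - 3.0e-6) / 6.0e-6) ** 2
best, sigma = delta_chi2_interval(grid, chi2_grid)
print(f"dmu/mu = {best:.2e} +/- {sigma:.1e}")
```

If the χ² curve is visibly asymmetric, the two sides should instead be searched separately for the Δχ² = 1 crossings.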

Comparison of the two methods. Although the RRM is appealing because of the simpler
numerical methods required, the reduction in the number of free parameters with the
DCMM can be substantial. In particular, the DCMM requires n_v(n_t − 1) fewer free
parameters, where n_v is the number of H2 velocity components and n_t is the number of
H2 transitions used. The reduction in the number of free parameters under the DCMM
acts to improve the stability of the fitting process. In particular, individual transitions
may have very poorly constrained line parameters, despite the fact that these parameters
may be well constrained in a joint fit to many transitions. In the RRM, this
can cause certain transitions or, particularly, components to be removed during the χ²
minimisation process, rendering those transitions unsuitable for inclusion in the fit. VPFIT
will automatically remove components during the minimisation if their parameters move
outside certain user-defined boundaries. The two important ones for this scenario are that
the column densities of transitions must be greater than 10⁸ cm⁻² and the b-parameters
must be greater than 0.05 km s⁻¹. With the DCMM, the tying of components helps to
prevent these transitions/components from being removed, allowing for the inclusion of a
greater number of transitions.

Another assumption in the RRM is that the errors on the line redshifts are Gaussian. In
the event where a transition is blended on one side with a forest line, the uncertainty on
the redshift for the H2 transition will almost certainly be asymmetric. This means that
the errors on the reduced redshifts, ζ_i, will also be asymmetric (and not Gaussian). χ²
minimisation of a linear fit to ζ_i vs K_i assumes that the errors are Gaussian (or at least
symmetric), and therefore the use of the standard errors from the spectral fitting in a fit
of ζ_i vs K_i will only be approximately valid. This problem should not affect the DCMM,
however. This is because the determination of Δμ/μ is derived from the (potential)
velocity shifts from many transitions. Because of the central limit theorem, the estimate
of Δμ/μ should be approximately Gaussian. Because Δμ/μ is determined simultaneously
with all the redshift parameters, any asymmetry in the uncertainty of individual line
redshifts will be accounted for when searching for the best-fitting value of Δμ/μ. Similarly,
the uncertainty on Δμ/μ is determined directly from the curvature of χ² at the purported
best fit, meaning that it should be robust. Thus, the estimate of Δμ/μ derived from the
DCMM is more likely to be accurate than one derived from the RRM.

Ultimately, we prefer the DCMM, as it is both faster and more reliable, and works directly 
with the spectral data rather than on intermediate quantities, although we use the RRM 
as a check where possible. 

3-2.1.4 The importance of the Werner series 

Although a good constraint on Δμ/μ is possible using only the Lyman series, it is clear from
figure 3.2 that K_i is well correlated with rest wavelength for the Lyman series. This implies
that a simple stretching or compression of the wavelength scale would mimic variation in
μ. The use of the Werner series helps to break this degeneracy to some degree, as for rest
wavelengths λ < 1020 Å, where the Werner series exists, the Lyman transitions move in a
significantly different fashion to the Werner transitions. Importantly, for rest wavelengths
990 Å < λ < 1020 Å the Werner series transitions move in the opposite direction to the
Lyman series transitions if Δμ/μ ≠ 0. It will be seen in chapter 4 that the different
magnitudes and signs of the q coefficients play a similar role in providing robustness
against a similar stretching or compression of the spectrum when searching for Δα/α (the
q coefficients are the sensitivity coefficients used, and are analogous to the K_i coefficients
for μ).

3-2.1.5 Previous constraints

Varshalovich & Levshakov (1993) analysed the spectrum of Foltz et al. (1988) of the z =
2.811 absorber toward Q0528-250 to obtain |Δμ/μ| < 0.005. Varshalovich & Potekhin
(1995) reanalysed the same spectrum to conclude that |Δμ/μ| < 0.002. Potekhin et al.
(1998) used new observations of the same system at higher resolving power (R ~ 14,000)
to obtain Δμ/μ = (−10 ± 8) × 10⁻⁵ using the laboratory wavelengths for H2 of Abgrall
et al. (1993a,b).

Cowie & Songaila (1995) used an R = 36,000 Keck observation of the z = 2.811 absorber
toward Q0528-250 to produce Δμ/μ ∈ [−7, 5.5] × 10⁻⁴ (95 percent confidence limits). This
result was the first to be obtained with the 8-10 m class optical telescopes, which supersede
the previous ~ 4 m class telescopes. The extra collecting area allows spectra to be taken
with significantly higher R in a reasonable amount of time. The precision with which Δμ/μ
can be determined increases with the resolving power of the spectrum*. Additionally,
higher resolving powers are important in attempting to determine the velocity structure
of the H2 absorbers. The H2 clouds are cold, leading to line widths of only a few km s⁻¹.
With low-R spectra, it is extremely difficult to determine any velocity structure present
in the absorbers, as it is below the instrumental resolution. A spectrum with R = 36,000
corresponds to an instrumental resolution of ≈ 8.3 km s⁻¹. With this R, it is possible
to clearly distinguish the H2 lines from the surrounding Lyman-α forest, and to start
to resolve detailed velocity structures. Higher resolving powers obviously lead to better
results.
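
The correspondence between resolving power and velocity resolution follows from R = λ/Δλ = c/Δv, with Δv the FWHM of the instrumental profile. A small sketch (the function name is illustrative) evaluates this for the resolving powers quoted in this section:

```python
C_KMS = 299792.458  # speed of light in km/s

def resolution_fwhm_kms(R):
    """Instrumental velocity resolution (FWHM, km/s) for resolving power R."""
    return C_KMS / R

for R in (14000, 36000, 43000, 53000):
    print(f"R = {R:6d}: FWHM = {resolution_fwhm_kms(R):5.1f} km/s")
```

For R = 36,000 this reproduces the ≈ 8.3 km s⁻¹ quoted above, which is still several times the intrinsic width expected for cold H2 lines.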

Ivanchik et al. (2002) analysed much higher quality (R ~ 43,000, SNR ~ 10 to 40 per
pixel) VLT/UVES spectra of the z = 3.025 system toward Q0347-383 and the z =
2.338 system toward Q1232+0815 with the RRM. Using the K_i values of Varshalovich
& Potekhin (1995) they found that Δμ/μ = (5.8 ± 3.4) × 10⁻⁵ and Δμ/μ = (14.4 ±
11.4) × 10⁻⁵ for the two systems respectively. A combined regression analysis gave Δμ/μ =
(5.7 ± 3.8) × 10⁻⁵, where the H2 wavelengths of Abgrall et al. (1993a,b) are used, or
Δμ/μ = (12.5 ± 4.5) × 10⁻⁵ if the wavelengths of Morton & Dinerstein (1976) are used.

A significant potential source of systematic error in all of the above results arises from
uncertainties in the laboratory measurements of the H2 wavelengths. Ivanchik et al. (2002)
state the measurement errors in the wavelengths of Abgrall et al. (1993a,b) to be ~ 1.5 mÅ.
The fractional error in the wavelengths is thus of order Δλ/λ ~ 1.5 mÅ/1000 Å = 1.5 × 10⁻⁶
(the H2 transitions have 900 Å ≲ λ ≲ 1150 Å). With Δλ/λ = Δv/c this implies a velocity
uncertainty of ~ 450 m s⁻¹. This can be converted into an implied systematic with Δv/c ~
|ΔK_i| Δμ/μ, with |ΔK_i| being the range of K_i values used. |ΔK_i| is typically 0.05, thus
implying a systematic error term of Δμ/μ ~ 3 × 10⁻⁵. However, the difference between
their two results indicates that the systematic error is larger (Ivanchik et al., 2002). On
account of this, there has been considerable laboratory work in recent years to generate
laboratory wavelengths of sufficient accuracy that they do not contribute appreciably to
the total error budget. Philip et al. (2004) used a narrow band XUV laser source to
provide a substantially improved (although incomplete) line list, where the errors for the
highest energy level states are an order of magnitude smaller than those from Abgrall
et al. (1993a,b). Hollenstein et al. (2006) performed a similar
experiment, completing the line list of Philip et al. (2004). The fractional accuracies for
these wavelength measurements are of order ~ 5 × 10⁻⁸.

* Assuming that SNR is held constant.
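
The order-of-magnitude argument above can be written as a short calculation. This is a sketch only: the function name and the default |ΔK_i| = 0.05 are assumptions taken from the typical value quoted in the text, and the second call uses an assumed example fractional accuracy to show how the implied systematic scales.

```python
C_MS = 299_792_458.0  # speed of light in m/s

def lab_wavelength_systematic(frac_err, delta_K=0.05):
    """Convert a fractional laboratory wavelength error into
    (velocity uncertainty in m/s, implied systematic on dmu/mu),
    using dv/c = dlambda/lambda and dv/c ~ |delta_K| * dmu/mu.
    """
    dv = C_MS * frac_err
    dmu = frac_err / delta_K
    return dv, dmu

# 1.5 mA errors at ~1000 A, as for the older laboratory wavelengths.
print(lab_wavelength_systematic(1.5e-3 / 1000.0))
# An assumed example accuracy of 5e-8 for laser-based measurements.
print(lab_wavelength_systematic(5.0e-8))
```

The implied systematic scales linearly with the fractional wavelength error, which is why improved laboratory wavelengths remove this term from the error budget.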

Ubachs & Reinhold (2004) used the wavelengths of Philip et al. (2004) to analyse the
absorbers in the spectra of Q0528-250, Q0347-383 and Q1232+082. For the combined data,
they found that Δμ/μ = (−0.5 ± 1.8) × 10⁻⁵ using the RRM. Omitting the Q0528-250
data, which is of poorer quality, they obtained Δμ/μ = (1.9 ± 1.5) × 10⁻⁵.

Ivanchik et al. (2005) analysed the Q0347-383 absorber, as well as a new one toward
Q0405-443 (from a spectrum obtained using VLT/UVES) using the wavelengths of Philip
et al. (2004) to obtain Δμ/μ = (1.47 ± 0.83) × 10⁻⁵ for the system towards Q0347-383
and Δμ/μ = (2.11 ± 1.39) × 10⁻⁵ for the system towards Q0405-443 using the RRM.

Reinhold et al. (2006) used the wavelength data of Philip et al. (2004) and Hollenstein
et al. (2006) to examine the spectra of Ivanchik et al. (2005) (Q0347-383 and Q0405-443).
They recalculated the K_i values in a significantly more accurate fashion, as described
earlier. They noted that the K_i values for highly excited states changed significantly as
a result of the post-BOA corrections, and that all K_i values experienced a systematic
shift due to the adiabatic correction. They found that Δμ/μ = (2.06 ± 0.79) × 10⁻⁵ for
Q0347-383 and Δμ/μ = (2.78 ± 0.88) × 10⁻⁵ for Q0405-443 using the RRM. It is worth
noting that their points with K_i < 0 demonstrate an unusually small scatter, and indeed
they conceded that their result differs from previous works primarily as a result of the
addition of new laboratory wavelengths for the v = 0 and v = 1 Lyman bands, which
correspond to these K_i values. A combined weighted fit yielded Δμ/μ = (2.4 ± 0.6) × 10⁻⁵
using the RRM, although χ²_ν = 2.1 for the ζ_i values about the linear model for ζ_i vs K_i,
suggesting that unmodelled errors exist. An unweighted fit gave Δμ/μ = (2.0 ± 0.6) × 10⁻⁵.
This result seems to suggest that μ was larger in the past at the > 3.5σ confidence level.
The result of this paper formed the motivation for the analysis of this chapter. The analysis
of Reinhold et al. was explained in considerably more detail in Ubachs et al. (2007).
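
The combined value quoted above comes from a joint fit to the reduced redshifts of both absorbers, but an inverse-variance weighted mean of the two per-absorber results provides a quick consistency check. The sketch below is only that check (the helper `weighted_mean` is an illustrative name); values are in units of 10⁻⁵.

```python
import numpy as np

def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its standard error."""
    values = np.asarray(values, dtype=float)
    w = 1.0 / np.asarray(errors, dtype=float) ** 2
    mean = np.sum(w * values) / np.sum(w)
    err = 1.0 / np.sqrt(np.sum(w))
    return mean, err

# Per-absorber RRM results of Reinhold et al. (2006), in units of 1e-5.
mean, err = weighted_mean([2.06, 2.78], [0.79, 0.88])
print(f"dmu/mu = ({mean:.2f} +/- {err:.2f}) x 1e-5, {mean / err:.1f} sigma from zero")
```

The weighted mean is close to the quoted combined fit, and its formal significance is consistent with the > 3.5σ statement; neither accounts for the excess scatter indicated by the reduced χ².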

A potential systematic effect in the analysis of molecular hydrogen concerns spatial
segregation of the different J-levels of the ground state. Jenkins & Peimbert (1997) noted
that there appeared to be small velocity shifts between J = 0 and J = 3 transitions of
H2 observed towards ζ Orionis A, with Δv ~ 0.8 km s⁻¹. However, this effect was not
observed toward other stars (Jenkins et al., 2000). Levshakov et al. (2002) claim to detect
a gradual shift in z_abs with increasing J in their analysis of the z = 3.025 absorber toward
Q0347-383. Murphy (2002) notes that shifts of similar magnitude to those seen in
Jenkins & Peimbert (1997) would lead to a systematic error in Δμ/μ of ~ 4 × 10⁻⁵.
However, Reinhold et al. (2006) addressed these concerns by showing that no significant
correlation exists between ζ_i and J or ζ_i and λ_i.

Wendt & Reimers (2008), Thompson et al. (2009) and Wendt & Molaro (2010) all
investigated Q0347-383 and Q0405-443 to examine the results of Reinhold et al. (2006). We
defer discussion of these results to section 3-5.1 so that they can be interpreted in the
context of this work, which was reported first in King et al. (2008).

The results of King et al. (2008) (this work) are given in tables 3.5 and 3.6. 

Malec et al. (2010) analysed the z = 2.059 molecular hydrogen system toward J2123-0050
using 86 H2 transitions from Keck observations. They also used 7 HD (deuterated molecular
hydrogen) transitions in their analysis, the first constraint on μ variation to utilise HD.
They found that Δμ/μ = (+5.6 ± 5.5_statistical ± 2.9_systematic) × 10⁻⁶. The systematic error
contribution arises predominantly from wavelength calibration uncertainties; however, the
estimate is model dependent. Like this work, Malec et al. applied the DCMM to reduce
the number of free parameters in the fit, and improve the robustness of the result. They
also modelled the Lyman-α forest in a similar fashion to this work.

3-2.2 Δμ/μ from ammonia

The inversion transitions of ammonia (NH3), which arise because the nitrogen atom can
tunnel through the potential barrier due to the hydrogen atoms from one side of the
molecule to the other, are strongly sensitive to a change in μ, with
K ~ 4.2 (Flambaum & Kozlov, 2007). Murphy et al. (2008a) and Henkel et al. (2009)
compared the inversion transitions of ammonia with rotational transitions of other
molecules to determine very stringent limits on Δμ/μ at z < 1. Murphy et al. (2008a) used
B0218+357 to find that |Δμ/μ| < 1.8 × 10⁻⁶ (95% confidence) at z = 0.68, whilst Henkel
et al. (2009) concluded that |Δμ/μ| < 0.9 × 10⁻⁶ (2σ confidence) from PKS1830-211 at
z = 0.89.

The ammonia method is theoretically preferable to the analysis of molecular hydrogen, as 
the sensitivity coefficient is larger by a factor of ~ 100, and the transitions are not blended 
with the Lyman-a forest. However, there are some drawbacks. In particular, quasars are 
point sources in the optical but are manifestly extended sources in the radio. This implies 
that the clouds from which the ammonia transitions arise may not be spatially co-located 
with the rotational transitions. Spatial offsets in the radial direction will lead to velocity 
differences, which would mimic a change in μ. Molecular hydrogen is much less prone to
this problem, because one is comparing transitions which arise from the same molecule
(albeit from different J-levels of the ground state). If one only compares transitions which
arise from the same J-level, then the concern of spatial segregation is eliminated. Perhaps
more importantly, there are few sources known which possess the necessary ammonia
transitions, and none at high (z > 1) redshift.

3-3 Methods & methodology

Our goal was to re-analyse the work of Reinhold et al. (2006) and confirm or dispute the 
apparent evidence for a change in μ. Our methodology differs from that of Reinhold et al.
in four significant ways:

1. We model the Lyman-α forest in the vicinity of the H2 transitions using Voigt profiles
(not polynomials);

2. We use the DCMM rather than the RRM (although we retain the RRM as a check
on our results);

3. Our spectra have been re-reduced using a new thorium-argon wavelength calibration
algorithm, which yields substantially improved wavelength calibration; and

4. We correct for under-estimation of the flux uncertainties in regions of low flux in
VLT/UVES spectra.

3-3.1 Spectral data 

The first stage of our analysis examined the absorbers in Q0347-383, Q0405-443 and
Q0528-250. We are grateful to H. Menager and M. Murphy, who reduced the exposures
from 2D format to 1D format, and then co-added the 1D exposures within UVES_popler.
They also cleaned the spectra to remove the effect of data problems, including removing
cosmic rays which are not removed by the automatic algorithm within UVES_popler,
ghosts caused by reflections within the UVES enclosure and other inconsistencies between
the contributing exposures. The analysis of the absorbers in each of these systems leads
to the results in section 3-4.

The exposures used by Ivanchik et al. (2005) which contribute to the spectra for Q0347-383
and Q0405-443 were obtained on VLT/UVES in January 2002 and 2003. The exposures
contributing to their spectrum of Q0347-383 were obtained under program IDs
68.A-0106(A) and 68.B-0115(A), whilst those contributing to Q0405-443 were obtained under
program ID 70.A-0017(A). For each object, nine exposures of 1.5 hours each were taken
with a slit width of 0.8 arcseconds, yielding a resolution of R ~ 53,000 and a SNR of
between ~ 30 and ~ 70 over the wavelength range 3290 to 4515 Å. Prevailing seeing was
sub-arcsecond. The ThAr calibration spectra were taken before and after the science
exposures, and so the wavelength calibration should be good. Ivanchik et al. (2005) note
that the temperature drift at UVES is sufficiently small that the uncertainty introduced
into wavelength calibration as a result of temperature drift is negligible. Further details
can be found in Ivanchik et al. (2005).

Besides the exposures noted above, for Q0347-383 we incorporated exposures from
program ID 60.A-9022(A), although these contribute only an additional 2.6 hours. For
Q0405-443, we also made use of additional exposures under program IDs 68.A-0361(A),
68.A-0600(A) and 68.A-0361(A).

The exposures which contribute to the spectrum for Q0528-250 were obtained with
VLT/UVES between 2001 and 2002 under program IDs 66.A-0594, 68.A-0600 and
68.A-0106, with a total exposure time of 21.9 hours. A slit width of 1.0 arcseconds was used
for all exposures. Seeing was generally sub-arcsecond.

However, Q0528-250 was re-observed in late 2008/early 2009 under program ID
82.A-0087, with exposures totalling approximately 8.2 hours, after our analysis of the previous
spectrum of Q0528-250 was complete. We have re-analysed the absorber in Q0528-250
using these new exposures to provide an additional constraint on Δμ/μ. For clarity, we
refer to the spectrum created from exposures under program IDs 66.A-0594, 68.A-0600
and 68.A-0106 as Q0528:A, and give the results for this analysis in section 3-4. We refer
to the spectrum generated from program ID 82.A-0087 as Q0528:B2, and discuss this
particular spectrum in section 3-6.

For our analysis of Q0405-443, Q0347-383 and Q0528-250 (Q0528:A), we have used
K_i values and laboratory wavelength values from Ubachs et al. (2007). However, for
our second analysis of Q0528-250 (Q0528:B2) we used the data from table 1 of Malec
et al. (2010), which includes the work of Ubachs et al. (2007) but also includes newer
measurements from Bailly et al. (2009).

3-3.2 Wavelength calibration 

When searching for variations in μ at the 10⁻⁵ to 10⁻⁶ level, the spectra must be
accurately calibrated at the mÅ level. The wavelength scale of the science echelle exposure
is calibrated through a secondary calibration exposure, usually using a thorium-argon
(ThAr) lamp, which produces a large number of well-measured transitions across the total
wavelength coverage of an optical telescope. Murphy et al. (2007a) considered the line list
used by the UVES pipeline in detail, and considered different factors which may introduce
errors into the wavelength calibration process. One potential problem is the use of blended
ThAr lines which are unresolved in typical UVES spectra (R ~ 30,000 to 70,000). Use of
such lines will cause bias in the ThAr line centroid measurement and therefore in the
wavelength calibration. Another factor considered is the use of weak lines, which may cause
false identification by the UVES pipeline. Similarly, they also consider the fact that the
existing ThAr line lists contain inaccuracies, and therefore they reject ThAr lines which
have large residuals. The result of this is a new ThAr line list for which the wavelength
calibration residuals (RMS ~ 35 m s⁻¹) are a factor of three better than those achieved
using the ESO line list or the line list of de Cuyper & Hensberge (1998) (RMS ~ 130
m s⁻¹). Murphy et al. note that not only are the random calibration errors significantly
improved through the use of this line list, but long-range variations with
peak-to-peak amplitudes of up to ~ 75 m s⁻¹ are also reduced. Our spectra have been
wavelength calibrated using the calibration algorithm of Murphy et al. (2007a), and therefore
our spectra should have significantly better calibration than the spectra used in previous
analyses.

3-3.3 Correction for underestimated flux errors 

As described in section 2-2, the uncertainty estimates on flux values in the base of
saturated lines appear to be too low. To correct for this in the spectra for Q0405-443,
Q0347-383 and our first analysis of Q0528-250 (Q0528:A) we applied the heuristic
correction described in section 2-2.1. We show the results of a number of measurements of
the ratio of the RMS of pixels in the base of saturated lines to the average of the RMS
array in table 3.3 (the RMS array is a modified version of the flux error array produced
by UVES_popler which attempts to account for inter-pixel correlations). It is clear that
the error estimates are too small by a factor of approximately 2. For these three quasar
spectra, we modify the error arrays using the functional form in equation 2.4, choosing b
such that (1 + b)^(1/2) = R as given in table 3.3.
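
The measurement of R and the corresponding choice of b can be sketched as below. This is a hedged illustration rather than the procedure actually used: it assumes the relation (1 + b)^(1/2) = R, takes the RMS of the flux about its mean within the saturated trough, and uses hypothetical function names and synthetic pixel values. The functional form of equation 2.4 itself is not reproduced here.

```python
import numpy as np

def error_ratio(flux, err):
    """R: RMS of the flux in a saturated trough divided by the mean quoted error."""
    flux = np.asarray(flux, dtype=float)
    err = np.asarray(err, dtype=float)
    return np.std(flux, ddof=1) / np.mean(err)

def b_from_ratio(R):
    """Solve (1 + b)**0.5 = R for b (assumed relation)."""
    return R**2 - 1.0

# Synthetic trough: true noise 0.02, quoted error 0.01 (underestimated by 2).
rng = np.random.default_rng(0)
flux = rng.normal(0.0, 0.02, 200)
err = np.full(200, 0.01)
R = error_ratio(flux, err)
print(f"R = {R:.2f}, b = {b_from_ratio(R):.2f}")
```

Under this assumed relation, an underestimate by a factor of about 2 corresponds to b of about 3.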

Table 3.3: Evidence for underestimated uncertainties on flux values in the base of
saturated lines. For each quasar spectrum, n shows the number of measurements taken in the
base of saturated lines. The quantity R = σ_f/σ is the ratio of the RMS of the flux array
to the average of the RMS array (a modified version of the error array which attempts
to account for inter-pixel correlations). This quantity should be ~ 1 if the error arrays
correctly reflect the noise in the spectral data, but will be larger than 1 if the error arrays
are underestimated. σ_R indicates the standard error on this quantity. The final column
indicates the deviation of R from the expected value of unity, expressed as a multiple of
the standard error. These data clearly indicate that the flux uncertainty estimates in the
base of saturated lines are too small by a factor of approximately 2.

In figure 3.5 we show the measurements taken in the base of saturated lines for Q0405-443.
It is clear that there is significant scatter between the individual measurements, and
therefore the functional form of equation 2.4 will only be approximately correct. Additionally,
there appears to be a weak wavelength dependency, with the problem worse in the blue end
of the spectrum. We have not investigated whether this is a true wavelength dependency,
or whether it is simply a function of SNR (which is correlated with wavelength because
the spectrograph throughput is worse in the blue end of the spectrum). Equation 2.4 can
be modified to account for a wavelength dependence of the observed problem, but we did
not do this.

Figure 3.5: The factor by which errors in the base of a selection of saturated lines are
underestimated in the spectrum of Q0405-443. The quantity R = σ_f/σ is the ratio of the
RMS of the flux array to the average of the RMS array (a modified version of the error
array which attempts to account for inter-pixel correlations). The dotted blue line shows
the expected value of 1 if the error arrays correctly account for inter-pixel dispersion. Note
that there is significant scatter between individual measurements, and also that there may
be a correlation of the effect with wavelength.



3-3.4 Free parameters & physical assumptions 



The observed transitions of molecular hydrogen consist of transitions from the ground 
states to upper excited states for the Lyman and Werner bands. By "ground states", we 
refer to the subdivision of the lowest energy level into levels with different angular
momentum J. The different J-levels of the ground state have different relative populations, which
depend on the temperature of the gas cloud but also on the influence of non-equilibrium
processes (e.g. collisions). The non-equilibrium processes simply cause the relative
populations in the different J-levels to be different from the Boltzmann distribution (Spitzer
(Jr) & Cochran, 1973; Levshakov & Varshalovich, 1985). In particular, transitions with
high J display apparent overpopulation relative to low-J transitions. Clearly, transitions
which arise from the same ground state must have the same b-parameter and the same
column density.

We make further physical assumptions which reduce the number of free parameters in 
the fit. The most important of these, noted earlier, is that all transitions arise from the 
same location, and therefore have the same z. For H2 absorbers with multiple velocity 
components, this means that corresponding components in all transitions have the same z.
We also explore whether we can impose the requirement that all transitions have the same
b-parameter, irrespective of J (for H2 absorbers with multiple velocity components, this
means that corresponding components in all transitions have the same b). By minimising
the number of free parameters in the fit the optimisation process should be more robust.
Similarly, by imposing physical constraints on the problem it is more likely that our derived
value of Δμ/μ will be accurate.

In order to address the concerns in section 3-2.1.2 relating to continuum fitting, in regions 
where the local continuum is uncertain we allow for a linear continuum which is
determined simultaneously with all other parameters. The uncertainty in determining the local
continuum therefore propagates into the uncertainty on Δμ/μ.

We note that in addition to the under-estimation of flux uncertainty in regions of low 
flux, there appears to be residual flux in the base of many saturated lines. The typical 
magnitude of this effect is about 2% of the local continuum. Whilst weak sky emission 
should be subtracted as part of the flux extraction, it appears that the MIDAS pipeline
systematically underestimates the subtraction required, leading to the observed effect. A
similar problem has been noted previously by Malec et al. (2010), albeit in relation to a
Keck/HIRES spectrum of J2123-0050. We attempt to correct for this problem by allowing
the zero level to vary in any region which includes absorption lines which are saturated,
or nearly saturated. As for the continuum, the uncertainty in determining the zero level
propagates into the uncertainty on Δμ/μ.
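
To make the roles of the two nuisance parameters concrete, the sketch below shows one simple way a local linear continuum and a floating zero level can enter a model flux. The parametrisation and the function name `model_flux` are assumptions for illustration and are not necessarily those used by VPFIT; `tau` is the optical depth from the absorption model.

```python
import numpy as np

def model_flux(wave, tau, cont0, cont_slope, zero_level, wave_ref):
    """Model flux with a linear continuum and a constant zero-level offset.

    For tau = 0 the flux equals the continuum; for tau >> 1 it tends to
    zero_level instead of zero, mimicking residual flux in saturated lines.
    """
    wave = np.asarray(wave, dtype=float)
    continuum = cont0 + cont_slope * (wave - wave_ref)
    return zero_level + (continuum - zero_level) * np.exp(-np.asarray(tau, dtype=float))

wave = np.array([4000.0, 4000.5, 4001.0])
tau = np.array([0.0, 50.0, 0.0])          # saturated pixel in the middle
print(model_flux(wave, tau, 1.0, 0.1, 0.02, 4000.0))
```

Because both the continuum coefficients and the zero level are fitted together with the line parameters, their uncertainties are carried through to the final error on Δμ/μ.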

3-3.5 Modelling the Lyman-α forest with molecular hydrogen

The structure of the Lyman-α forest is unknown a priori, and therefore must be modelled
from the observed flux profile. Our model of the molecular hydrogen transitions with the
forest was built up iteratively. With knowledge of the redshift of the molecular hydrogen
absorbers, in each spectrum we searched for molecular hydrogen transitions which we
considered to be potentially usable. We consider potentially usable transitions to be those for
which the molecular hydrogen transition can be visually distinguished from its surroundings.
This necessarily precludes the use of H2 transitions in regions of near zero flux, but in any
event these transitions would contribute no meaningful constraint on Δμ/μ.

From a list of potentially usable transitions, we then selected a buffer region around the
H2 transition, where the region should be large enough to include any absorption feature
which might overlap with the H2 transition. In general, we attempted to ensure that the
fitting region was sufficiently large so as to return to the local continuum, although this
was not always possible. In each of the fitting regions, we modelled the molecular hydrogen
transition and then modelled all surrounding features as H I. To do this, we added and
removed H I components to attempt to achieve a statistically satisfactory model, using
the criteria set out in section 2-1.2. Note that although most transitions observed in the
forest are indeed due to H I, there are also metal transitions from other absorbers along
the line of sight (including Galactic and atmospheric lines). The identification of the origin
of these transitions is not necessary if they do not overlap with the H2 transitions; we
simply modelled them as H I in order to have a physical model for them. We describe the
treatment of metal lines which overlap with H2 lines below. For all transitions assumed
to be H I (which we refer to hereafter as just H I transitions), we use only the λ1215.7
transition rather than the whole Lyman series, to prevent line misidentification spuriously
impacting regions blueward of that transition. Where Lyman-β transitions exist in the
blue region of the spectrum, we simply modelled them with additional H I components.

We then combined models from the regions fitted individually into a model where the 
regions are fitted simultaneously. As the line parameters for the individual H2 transitions 
were independent when the regions are fitted independently, at this stage we imposed 
physical restrictions on the transitions by tying certain parameters together. The H2
absorbers in Q0347-383 and Q0405-443 appear to be well modelled by a single component.
For these absorbers, we required that the redshifts of all of the transitions are the same
and therefore tie them together within VPFIT. We also required that the b-parameters
be the same. Although the line strengths can be in principle determined from the
oscillator strengths and a single column density, we allowed the column densities for each
transition to be determined independently (effectively fitting the oscillator strengths as 
free parameters). 

The absorber in Q0528-250 requires more than one component to model the structure
correctly. We describe how we determined the velocity structure below in section 3-4.1.1. 
For this absorber, we required that the redshifts of corresponding components be the same. 
As above, we fitted the column densities for each transition as free parameters. However, 
we wished to ensure that a physical consistency is maintained, in that the ratios of the line 
strengths between different components should be the same for transitions arising from 
the same ground state. We therefore imposed the requirement that the ratio of the column 
densities between the different components was the same for transitions arising from the 
same J-level. In this way, the total column density (effectively, oscillator strength) for 
each transition was a free parameter, but the ratios of the individual column densities 
within each transition were constrained. 
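
The constraint described above can be pictured as a reparameterisation: each transition keeps its own free total column density, while the fractional split of that total among the velocity components is shared by all transitions arising from the same J-level. The following is a minimal sketch under that reading. The function name `component_log_columns` and all numbers are hypothetical, and this is not how VPFIT expresses tied parameters internally.

```python
import math

def component_log_columns(log_n_total, fractions):
    """Split a transition's total column density among velocity components.

    log_n_total : log10 of the total column density (free, one per transition)
    fractions   : relative share of each component (shared within a J-level)
    """
    total = sum(fractions)
    return [log_n_total + math.log10(f / total) for f in fractions]

# Hypothetical numbers: two J = 2 transitions with different fitted totals
# but the same component split, so their component ratios are identical.
shared_split_j2 = [0.6, 0.3, 0.1]
trans_a = component_log_columns(16.2, shared_split_j2)
trans_b = component_log_columns(15.7, shared_split_j2)

# The difference in log N between components 1 and 2 is the same for both.
ratio_a = trans_a[0] - trans_a[1]
ratio_b = trans_b[0] - trans_b[1]
```

With this parameterisation the optimiser is free to move each transition's total up or down, but cannot change the relative strengths of the components within a J-level independently for each transition.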

We then iteratively refined the fit by alternately allowing VPFIT to minimise χ² for a
particular model, then attempting to improve that model through the addition and deletion
of H I components to obtain a robust model according to the criteria in section 2-1.2.

During the iterative process, it can become clear that a molecular transition is blended with
another line (presumed H I), even though a fit to that region alone gave no indication of a
blend. This is because the information from the other molecular hydrogen transitions imposes a
strong constraint on the b-parameter(s) and redshift(s) of that transition, thus uncovering
apparently hidden blends. These blends necessitate the addition of H I components that
overlap with the H2 transition in question. With the addition of extra H I transitions,
an acceptable fit can generally be achieved. This demonstrates the utility of fitting all
transitions simultaneously: otherwise inconspicuous blends are generally revealed. In a
few instances, the transitions which had to be included to achieve a statistically acceptable
fit had extremely narrow b-parameters (b < 5 km s⁻¹). In this case, it is likely that the
blend is a metal line from an unknown absorber along the line of sight. As a result,
we rejected the affected H2 transition. The reason for not accepting transitions affected by
narrow-b interlopers is that any inaccuracy in modelling the interloping transitions could lead
to a significant bias in measuring the H2 line position: the narrow b-parameter(s) of the
interloping transitions mean that the absorption they cause varies rapidly across the H2
line profile. Ultimately, the joint fit of all the molecular hydrogen transitions allows the
detection and rejection of transitions which are likely to be contaminated by metal lines.
Rejecting transitions which are suspected to be contaminated cannot bias Δμ/μ away from
zero. Moreover, this should not bias Δμ/μ significantly. If the suspicion of contamination
in particular lines was in fact due to Δμ/μ ≠ 0, we would expect to see this problem more
frequently, and more obviously, for transitions with larger |Ki|. The number of transitions
rejected was small, and did not appear to be correlated with |Ki|, and hence it is unlikely
that we are biasing Δμ/μ towards zero.

It is possible to add too many H i components to a particular region, leading to "over- 
fitting". Over-fitting is undesirable for several reasons. The primary reason is that it 
means that another, simpler model can explain the data better than the over-fitted model. 
Parsimony should be strongly valued in model selection, as noted in section 2-1.2. Perhaps 
more importantly, it means that the performance of the optimisation algorithm can be 
substantially impaired. With significant over-fitting, convergence to the minimum can 
be excessively slow. In extreme cases, convergence may not occur at all. Over-fitting can
be detected through two means:

1. The addition of components which increase the AICC suggests that the components 
are not supported by the data. If the AICC significantly decreases upon removal of 
the components, this suggests that the model was over-fitted. 

2. Over-fitting causes the uncertainty estimates on the parameters of the components
in question to be excessively large (this point was discussed by Gill et al., 1986). In
fact, this is often a good way to directly identify components which are potentially
unnecessary; the AICC relates to the model as a whole and therefore cannot suggest
which components may be unnecessary. In particular, H I transitions with σ(log10 N) ≳
1.0 or σ_b/b > 1 are certainly suspicious. In regions with substantial over-fitting,
errors can easily be substantially larger than this. The numerical cause of these large
errors is strong relative degeneracies between parameters. That is, χ² is almost flat
in some direction in the parameter space relating to the offending transitions. It is
this flatness in χ² which is the cause of poor convergence.

(a) Nevertheless, the presence of large errors on some components does not mean
that they are unnecessary. In particular, the column densities for transitions
which are saturated can be very poorly determined. This necessarily means
that saturated H I transitions will have large errors on the column density.
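
The two checks above can be sketched as follows. This is an illustrative sketch only: the AICC is written in its standard form for a χ² fit, AICC = χ² + 2p + 2p(p+1)/(n − p − 1), which we assume matches the definition in section 2-1.2, and the component records, labels and numbers are hypothetical rather than VPFIT output.

```python
def aicc(chisq, n_params, n_points):
    """Corrected Akaike information criterion for a chi-squared fit."""
    p, n = n_params, n_points
    return chisq + 2.0 * p + 2.0 * p * (p + 1) / (n - p - 1)

def suspicious_components(components, max_sigma_logn=1.0, max_rel_sigma_b=1.0):
    """Return labels of components whose errors exceed the thresholds."""
    flagged = []
    for c in components:
        too_uncertain_n = c["sigma_logN"] > max_sigma_logn
        too_uncertain_b = c["sigma_b"] / c["b"] > max_rel_sigma_b
        if too_uncertain_n or too_uncertain_b:
            flagged.append(c["label"])
    return flagged

# Hypothetical case: removing one H I component (3 parameters) raises
# chi-squared by only 1.5, so the AICC decreases and the simpler model wins.
delta = aicc(1001.5, 57, 900) - aicc(1000.0, 60, 900)

components = [
    {"label": "HI-1", "b": 25.0, "sigma_b": 3.0, "sigma_logN": 0.05},
    {"label": "HI-2", "b": 12.0, "sigma_b": 30.0, "sigma_logN": 1.8},
]
flagged = suspicious_components(components)
```

A flagged component is only a candidate for removal. As noted in (a), a saturated line can legitimately carry a large column density error, so each case still needs to be inspected.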

Because of the impact of over-fitting on the convergence of VPFIT, we spent considerable
effort trying to identify cases of over-fitting, and removing H I components as necessary
to minimise the problem.

We adopted as our final fits those for which we were not able to obtain any further
statistically appreciable improvement.

In practice, it is not important that the structure of the forest be modelled with total
accuracy in all regions. The goal is simply to fit all observable structure with a plausible
model, so that a plausible background flux model exists against which the molecular
hydrogen model is constructed. Although the uncertainty which propagates into the
determination of Δμ/μ is likely to be somewhat incorrect, proceeding in this fashion at least
attempts to account for the uncertainty in determining the forest structure.

The process of fitting the spectra constitutes almost all of the effort in obtaining Δμ/μ.
The speed of the optimisation algorithm unfortunately degrades rapidly with increased
numbers of parameters. For instance, consider the effort required to calculate the partial
derivatives of χ² with respect to each of the parameters*. For n Voigt profiles, one needs
3n parameters, and therefore there are 3n first-order partial derivatives†. For each of
these derivatives, n Voigt profiles must be generated. Similarly, the spectral density of
lines is approximately constant with wavelength, which implies that the number of pixels
at which the profile must be evaluated scales as O(n). Thus, the time required to evaluate
the partial derivatives at each iteration scales as O(n³). Even after parallelisation, the
time required for one step of the iteration process for our model for Q0528-250 on a
quad-core 3.2 GHz Intel i7 processor is about 10 minutes. Many iterations (typically ~10
to 30) are needed to make the model relatively close to optimal, with potentially many
more needed if strong degeneracies exist. Once the model is relatively optimal, human
interaction is then required to look for parameter degeneracies, areas of poor fitting and
the appropriateness of various parameters. Adjustments are made to the model, and
the optimisation restarted. One can easily see how this process becomes extremely
time-consuming. It is regrettable that the time required to obtain a final, satisfactory model for
a particular spectrum is of the order of months. We discuss future avenues of improvement
in this regard later.
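
The scaling argument above can be written out compactly. The symbol T_iter for the time per iteration is introduced here purely for illustration, and constant factors are ignored:

```latex
\[
T_{\mathrm{iter}} \;\propto\;
\underbrace{3n}_{\text{partial derivatives}} \times
\underbrace{n}_{\text{Voigt profiles per derivative}} \times
\underbrace{O(n)}_{\text{pixels evaluated}}
\;=\; O(n^{3})
\]
```

Under this scaling, doubling the number of fitted profiles increases the cost of each iteration by roughly a factor of eight.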

* Please see section 7-1.2 for more details on the theory behind the optimisation.

† Ignoring for the moment the parameters which describe the linear continuum fits and the zero-level
determination.






3-3.6 Other details 

The Voigt profile model must be convolved with a model for the instrumental profile in 
order to obtain a model which can be compared with the observed spectrum. In the 
case where all exposures which contribute to a spectrum are taken with the same slit 
width, and the quasar image fills the spectrograph slit uniformly, then the instrumental 
profile will be well-described by a Gaussian. The velocity width of this Gaussian can be 
determined from the ThAr spectrum. However, in the case where exposures are taken 
with different slit widths, or where the seeing fluctuates such that in some exposures the 
seeing is significantly better than the slit width, then the instrumental profile is difficult 
to model accurately. 

We have assumed that the instrumental profile is Gaussian, with a velocity FWHM of
6 km s⁻¹, for our initial analysis of Q0405-443, Q0347-383 and Q0528-250 (Q0528:A).
For our analysis of Q0528:B2, we have used an instrumental FWHM of 5.45 km s⁻¹, which
appears to better reflect the observed profile in that spectrum. Small errors made in
determining the instrumental resolution will necessarily lead to inaccuracy in modelling
the spectrum. However, because the Voigt profile is symmetric, these errors should not
significantly bias Δμ/μ if a sufficiently large number of molecular hydrogen transitions are
used.
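
As an illustration of the convolution step, the sketch below smooths a model spectrum with a Gaussian instrumental profile of a given velocity FWHM. It assumes pixels of constant velocity width, which is a simplification introduced here. The function name `convolve_instrumental` and the test line are hypothetical and do not reproduce the VPFIT implementation.

```python
import numpy as np

def convolve_instrumental(flux, pixel_kms, fwhm_kms):
    """Convolve a model spectrum with a Gaussian instrumental profile.

    Assumes every pixel has the same velocity width `pixel_kms` (km/s).
    """
    # Convert the FWHM to a Gaussian sigma, then to pixel units.
    sigma_pix = fwhm_kms / (2.0 * np.sqrt(2.0 * np.log(2.0))) / pixel_kms
    half = int(np.ceil(5.0 * sigma_pix))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_pix) ** 2)
    kernel /= kernel.sum()  # normalise so that total absorption is conserved
    # Pad with edge values so the continuum is not dragged down at the ends.
    padded = np.pad(flux, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Hypothetical narrow absorption line on a unit continuum, 1.3 km/s pixels.
v = np.arange(-100.0, 100.0, 1.3)
model = 1.0 - 0.9 * np.exp(-0.5 * (v / 1.5) ** 2)
observed = convolve_instrumental(model, pixel_kms=1.3, fwhm_kms=6.0)
```

Because the kernel is symmetric, the convolved line is broader and shallower but its centroid is unchanged, which is the property the argument above relies on.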

3-3.7 Comments on VPFIT 

We have used a modified version of VPFIT v9.5 to perform our analysis of the molecular
hydrogen data. Early investigations suggested to us that VPFIT was not adequately
converging for the full fits, which contain thousands of free parameters. We modified
VPFIT to augment the existing Gauss-Newton optimisation algorithm with the
Levenberg-Marquardt algorithm, and found that this produced reliable convergence. We describe
this further in section 7-1.2 in the context of the results of that chapter. We are grateful
to R. Carswell for merging our algorithm into the release version of VPFIT.
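
To illustrate the difference between the two algorithms, the following is a generic textbook Levenberg-Marquardt loop, in which a damping term is added to the Gauss-Newton normal equations. It is a sketch only and is not the modified VPFIT code. The function names, the damping schedule (factors of ten) and the single-line test problem are assumptions made for the example.

```python
import numpy as np

def lm_fit(model, jac, p0, x, y, sigma, lam=1e-3, n_iter=50):
    """Minimise chi-squared with a damped Gauss-Newton (Levenberg-Marquardt) loop."""
    p = np.asarray(p0, dtype=float)
    chisq = np.sum(((y - model(x, p)) / sigma) ** 2)
    for _ in range(n_iter):
        r = (y - model(x, p)) / sigma
        J = jac(x, p) / sigma[:, None]
        A = J.T @ J   # Gauss-Newton approximation to the Hessian
        g = J.T @ r
        # lam -> 0 recovers the Gauss-Newton step; a large lam gives a short,
        # gradient-descent-like step that is safer far from the minimum.
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), g)
        trial = p + step
        trial_chisq = np.sum(((y - model(x, trial)) / sigma) ** 2)
        if trial_chisq < chisq:
            p, chisq, lam = trial, trial_chisq, lam / 10.0
        else:
            lam *= 10.0
    return p, chisq

# Hypothetical test problem: a single Gaussian absorption line.
def line(x, p):
    depth, centre, width = p
    return 1.0 - depth * np.exp(-0.5 * ((x - centre) / width) ** 2)

def line_jac(x, p):
    depth, centre, width = p
    g = np.exp(-0.5 * ((x - centre) / width) ** 2)
    return np.column_stack([
        -g,
        -depth * g * (x - centre) / width**2,
        -depth * g * (x - centre) ** 2 / width**3,
    ])

x = np.linspace(-10.0, 10.0, 201)
truth = np.array([0.6, 0.5, 2.0])
sigma = np.full_like(x, 0.01)
y = line(x, truth)  # noiseless data, so the fit should recover `truth`
best, chisq = lm_fit(line, line_jac, [0.4, -0.5, 3.0], x, y, sigma)
```

The only change relative to a pure Gauss-Newton iteration is the `lam * diag(A)` term and the rule for adapting `lam`. A step that would increase χ² is rejected and retried with stronger damping.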

Malec et al. (2010) have investigated the convergence of VPFIT when determining Δμ/μ
from the z = 2.059 absorber toward J2123-0050 using Monte Carlo methods applied
to synthetically generated spectra. They find under 420 different realisations of a noisy
spectrum that VPFIT returns the correct value of Δμ/μ with an appropriate statistical
uncertainty. The noise was generated such that σ was 0.8 times the error array at each
pixel. The use of 0.8 rather than 1.0 was to ensure that marginally required Lyman-α
blends were always required in the simulated spectra. We therefore believe that the
parameter estimates and uncertainties produced by VPFIT here are likely to be reasonable.
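
The noise prescription described above can be sketched as follows. This illustrates only the scaling of the noise, not the procedure of Malec et al. The function name `noisy_realisation`, the flat model and the constant error array are hypothetical.

```python
import numpy as np

def noisy_realisation(model_flux, error_array, rng, scale=0.8):
    """Add Gaussian noise with sigma = scale * error_array to a model spectrum."""
    return model_flux + rng.normal(0.0, scale * error_array)

# Hypothetical flat model with a constant error array.
rng = np.random.default_rng(0)
model = np.ones(5000)
err = np.full(5000, 0.05)
sim = noisy_realisation(model, err, rng)
```

Each simulated spectrum would then be fitted with the nominal (unscaled) error array, so that the simulated noise is slightly smaller than the fit assumes.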
3-4 Results

3-4.1 Description of the absorbers 

The absorber at z = 3.025 toward Q0347-383 appears to be well modelled by a single
H2 velocity component. The absorption system at z = 2.595 toward Q0405-443 contains
one main velocity component, with another weaker component. The two components are
separated by ~13 km s⁻¹ in velocity space. However, many of the transitions of the weaker
component are weak or heavily blended, and so we have not utilised it. Where the weak
component is observed, we modelled it as H I in order to ensure that the observed spectral
features are accounted for. This also has the advantage of placing our analysis of this
absorber on a comparable basis to that of Reinhold et al. (2006), who also analysed only
the strong component. The structure of the absorber toward Q0528-250 is described
below.

3-4.1.1 Velocity structure of H2 in Q0528-250 

The system toward Q0528-250 presents with complex structure. We show an exemplary
J = 4 molecular hydrogen transition in figure 3.6. Ledoux et al. (2003) reported the
detection of multiple velocity components, and Srianand et al. (2005) modelled the absorber
with two components. Two components are plainly visible as a substantial asymmetry in
every line (see figure 3.6). We have tried modelling the absorber toward Q0528-250 with
2, 3 and 4 velocity components.

To determine whether more than two velocity components were required, we firstly considered
the AICC. A particular model with three H2 components compared to a model with
two has ΔAICC = −120.0. That is, the three component model is very strongly preferred
over the two component model. A similar model with four H2 components compared to
a model with three has ΔAICC = −20.9, which again indicates that the model with four
components is strongly preferred. Using the F-test, the probability that the reduction in χ²
from using three components instead of two is due to chance is p = 4 x 10^^*^. Compar- 
ing a model with four velocity components to one with three gives p = 1.8 x 10^^. This 
statistical evidence suggests that a model with four velocity components is appropriate.
The use of a five component model produced a fit that was highly unstable, by which
we mean that some of the H2 components were rejected by VPFIT as being statistically
unnecessary. As a result, we used the four component model as our primary model.

Figure 3.6: Demonstration of velocity structure in Q0528-250. The transition at ~4042 Å
is a molecular hydrogen transition. The presence of at least two components is
clearly demonstrated visually. The existence of more components must be determined
through appropriate statistical techniques. The green line is a model fitted to the data (in
black).

Notwithstanding the significant statistical evidence for four components, it is interesting
to consider the "per-transition" ΔAICC. With n = 64 transitions, ΔAICC(2→3)/n =
−1.88 and ΔAICC(3→4)/n = −0.33. Using the Jeffreys' scale (Jeffreys, 1961), only
~6 average transitions are necessary to conclude that there is very strong evidence
(ΔAICC < −10) for three components. However, to conclude that there is very strong
evidence for four components, one needs ~30 average transitions. Thus it is clear that a
large amount of spectral data is required to detect the fourth component.
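
The per-transition figures quoted above follow from simple arithmetic on the two ΔAICC values, as the short sketch below shows. The variable names are illustrative, and the threshold of −10 is the one used in the text.

```python
import math

n_transitions = 64
delta_aicc = {"2->3": -120.0, "3->4": -20.9}
threshold = -10.0  # "very strong" evidence, as used in the text

results = {}
for label, total in delta_aicc.items():
    per_transition = total / n_transitions
    # Number of average transitions needed to accumulate the threshold.
    needed = threshold / per_transition
    results[label] = (per_transition, needed)

min_transitions_three = math.ceil(results["2->3"][1])
```

The third component reaches the threshold after about six average transitions, whereas the fourth needs roughly thirty, in line with the values given above.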

We note that the strength of the statistical evidence for 3 and 4 components depends on a 
number of factors, including the correctness of the flux uncertainties and the choice of the 
correct instrumental resolution. Therefore, the true statistical evidence after considering 
unmodelled uncertainties is necessarily smaller. 

We also noted for this absorber that transitions of increasing J appear to have smaller
b-parameters; we show this effect in figure 3.7. In the spectrum, this has the effect of making
the velocity structure more pronounced for transitions with higher J. For this reason, in
deriving our estimates of Δμ/μ we allow transitions of different J to have different
b-parameters.



3-4.2 Transitions used & fits



We present in table 3.4 a list of the transitions used in each of the quasar fits. In figures
3.8, 3.9 and 3.10 we show the distribution of the Ki values and J-levels with rest wavelength
for the transitions used in our analysis of Q0405-443, Q0347-383 and Q0528-250
respectively. The Voigt profile fits to Q0405-443, Q0347-383 and Q0528-250 may be
found in appendices A, B and C respectively.

Figure 3.7: Relationship of b with J for the four components of the H2 fit for the
z = 2.811 absorber toward Q0528-250. The panels C1 through C4 show the relationship
for the four components of the fit in a model where transitions with different J are allowed
to have different b-parameters. The statistical uncertainties given are derived from the
covariance matrix of the fit. b cannot be Gaussian if b/σ_b < 1, and so the fact that some
error bars overlap with b = 0 is of no actual consequence. A more thorough investigation
would explore the actual confidence region, but that is not necessary for these purposes.
The panel C4, corresponding to the weakest (4th) component, reveals no obvious trend
of b with J, but this is expected due to the low column density (that is, the statistical
errors are large). However, in panels C1 through C3 a trend is noted that for transitions
with higher J the b-parameters are smaller. This is clearly seen in the spectra, where the
velocity structure is more obvious for higher J transitions. The results here imply that
forcing all transitions for a particular component to have the same b-parameter may not
be physically realistic. Such an assumption could lead to errors in determining Δμ/μ.



3-4.3 Results for Q0405-443, Q0347-383, and Q0528-250 



We present in tables 3.5 and 3.6 the results of the DCMM and RRM respectively applied
to the absorbers in the spectra of Q0405-443, Q0347-383, and Q0528-250 (Q0528:A).

Figure 3.8: Relationship of Ki and J with λ0 for the transitions used in Q0405-443.
Upper panel: the sensitivity coefficients, Ki, for the transitions used in our analysis of
Q0405-443 (dark blue points) and not detected or not fitted (grey points). Lower panel:
the distribution of transitions with wavelength according to their J-level.

Figure 3.9: Relationship of Ki and J with λ0 for the transitions used in Q0347-383.
Upper panel: the sensitivity coefficients, Ki, for the transitions used in our analysis of
Q0347-383 (dark blue points) and not detected or not fitted (grey points). Lower panel:
the distribution of transitions with wavelength according to their J-level.

Table 3.4: Transitions used in our fits for Q0405-443, Q0347-383 and Q0528-250. n
gives the number of transitions used.

Quasar spectrum   z_abs   n    Transitions used
Q0405-443         2.595   52   L0P1, L0P2, L0R0, L0R1, L0R2, L0R3, L1P2, L1P3,
                               L1R3, L2P3, L2R2, L3P2, L3P3, L3R2, L3R3, L4P3,
                               L4R2, L4R3, L5P2, L5R2, L5R3, L6P2, L6P3, L6R3,
                               L7P2, L7P3, L8P2, L8P3, L8R2, L8R3, L9P2, L9P3,
                               L9R2, L11P3, L12P2, L12R0, L12R3, L13P2, L14R2,
                               L15R2, L15R3, L16P2, W0R3, W2P2, W2R2,
                               W2Q2, W3R2, W3Q3, W4P2, W4R3, W4Q2
Q0347-383         3.025   68   L1P2, L1R1, L1R2, L2P2, L2P3, L2R0, L2R1, L2R3,
                               L3P1, L3P2, L3P3, L3R0, L3R1, L3R2, L3R3, L4P2,
                               L4P3, L4R1, L4R2, L4R3, L5P1, L5P2, L5R1, L5R3,
                               L6P2, L6P3, L6R2, L6R3, L7P2, L7P3, L7R0, L7R1,
                               L7R3, L8P1, L8P3, L8R0, L8R1, L8R2, L9P1, L9R1,
                               L9P2, L10P1, L10P3, L10R0, L10R1, L10R3, L11P1,
                               L11P2, L12P2, L12P3, L13R1, L14R1, L16R2,
                               W0R1, W0R2, W0Q2, W0Q3, W1R2, W1Q1, W1Q2,
                               W2P2, W2R1, W2R3, W2Q2, W2Q1, W2Q3, W3Q1
Q0528-250         2.811   64   L0R0, L0R1, L1P1, L1P2, L1R0, L1R1, L1R2, L1R3,
                               L2P1, L2P2, L2R2, L2R3, L3P1, L3P2, L3P3, L3P4,
                               L3R2, L3R3, L3R4, L4P2, L4P3, L4P4, L4R2, L4R3,
                               L5P2, L5P3, L5P4, L5R3, L5R4, L6P3, L6P4, L6R3,
                               L7P3, L7R2, L7R3, L8P3, L9R2, L9R3, L10P1,
                               L10P2, L10P3, L10P4, L10R3, L12R0, L13P1,
                               L13P2, L13R2, L13R3, L15P3, L15R2, L15R3,
                               L16P1, L16R2, L17R3, W0P2, W0R2, W0R4, W1R3,
                               W1Q2, W2P3, W2R2, W2R3, W4P3



We note that the use of the DCMM results in a substantial reduction in the number
of free parameters compared to the RRM. For Q0405-443, the DCMM yields 51 fewer
parameters and for Q0347-383 it yields 67 fewer parameters. Our preferred result is that
from a weighted mean of the DCMM results, which yields Δμ/μ = (2.6 ± 3.0) × 10⁻⁶,
compared with Δμ/μ = (24 ± 6) × 10⁻⁶ from Reinhold et al. (2006). We prefer the results
from the DCMM over the RRM for the reasons given in section 3-2.1.3. We therefore find
that our results are inconsistent with those from Reinhold et al., and that we do not obtain
a non-zero result. It is difficult to determine whether the three data points are consistent
about the weighted mean, as the χ² test has low statistical power to reject consistency
for small ν. Nevertheless, χ² = 1.47 for the three DCMM data points about the weighted
mean. A value of χ² this large or larger has a probability p = 0.48 of occurring by chance.
Thus, we can say that our results appear to be consistent, at least under a weighted
mean model, with no evidence for excess scatter due to unmodelled systematic effects.
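
For reference, the weighted mean and the quoted probability can be checked with a few lines of arithmetic. The sketch below uses the rounded DCMM values from table 3.5, so it reproduces the quoted mean and uncertainty only approximately (presumably because the table entries are rounded). The probability is evaluated for the χ² value quoted in the text with ν = 3 − 1 = 2 degrees of freedom, for which the survival function has the closed form exp(−χ²/2). The helper name `chisq_sf_2dof` is introduced here for illustration.

```python
import math

# DCMM values from table 3.5, in units of 1e-6.
values = [10.1, 8.2, -1.4]
errors = [6.6, 7.5, 3.9]

# Inverse-variance weighted mean and its uncertainty.
weights = [1.0 / e**2 for e in errors]
mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
mean_err = 1.0 / math.sqrt(sum(weights))

def chisq_sf_2dof(chisq):
    """P(chi-squared >= chisq) for exactly two degrees of freedom."""
    return math.exp(-chisq / 2.0)

p_value = chisq_sf_2dof(1.47)  # chi-squared value quoted in the text
```

The closed form is specific to two degrees of freedom; a general survival function would be needed for any other ν.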

We show in figure 3.11 on page 59 a reduced redshift plot for Q0405-443 and Q0347-383,
which has gradient Δμ/μ = (8.5 ± 5.7) × 10⁻⁶.

Figure 3.10: Relationship of Ki and J with λ0 for the transitions used in Q0528-250(A).
Upper panel: the sensitivity coefficients, Ki, for the transitions used in our analysis of
Q0528-250(A) (dark blue points) and not detected or not fitted (grey points). Lower
panel: the distribution of transitions with wavelength according to their J-level.

Table 3.5: Direct χ² minimisation method (DCMM) constraints on Δμ/μ for Q0405-443,
Q0347-383 and Q0528-250. n is the number of transitions. χ²_ν is the reduced χ² of the
spectral data about the Voigt profile model. The weighted mean is also given.

Quasar spectrum   Δμ/μ (DCMM)            χ²_ν   z_abs   n
Q0405-443         (10.1 ± 6.6) × 10⁻⁶    1.42   2.595   52
Q0347-383         (8.2 ± 7.5) × 10⁻⁶     1.28   3.025   68
Q0528-250(A)      (−1.4 ± 3.9) × 10⁻⁶    1.22   2.811   64
Weighted mean     (2.6 ± 3.0) × 10⁻⁶     n/a    2.81    n/a



We note that in our RRM linear fit of ζi against Ki, the ζi values demonstrate good
consistency with the linear model, with χ²_ν = 1.01 and 1.13 for Q0405-443 and Q0347-383
respectively, and χ²_ν = 1.06 for the two data sets fitted together. This contrasts with the
results of Reinhold et al. (2006), who found χ²_ν = 2.1. The smaller χ²_ν in
our case is likely to arise from a combination of: i) better wavelength calibration in our
spectra compared to the spectra of Reinhold et al.; ii) the simultaneous fitting of the
forest with the H2 transitions, which must increase the uncertainties on the redshifts of
each H2 transition; and, iii) the fact that the flux uncertainties in regions of low flux
are under-estimated, and not corrected for, in the data of Reinhold et al., which means
that the statistical uncertainties on redshifts will be under-estimated. It is reassuring that
χ²_ν ~ 1 for our RRM fits.

Table 3.6: Reduced redshift method (RRM) constraints on Δμ/μ for Q0405-443,
Q0347-383 and Q0528-250. n is the number of transitions. The RRM cannot be applied
to Q0528-250 for reasons set out in the text. χ²_ν gives the reduced χ² about the linear
fit.

Quasar spectrum   Δμ/μ (RRM)             χ²_ν   z_abs   n
Q0405-443         (10.9 ± 7.1) × 10⁻⁶    1.01   2.595   52
Q0347-383         (6.4 ± 10.3) × 10⁻⁶    1.13   3.025   68
Q0405 + Q0347     (8.5 ± 5.7) × 10⁻⁶     1.06   2.811   120
Q0528-250(A)      n/a                    n/a    n/a     n/a



3-4.3.1 Bootstrap verification 

We have assessed the results of the reduced redshift plot using a resampling bootstrap
method (Press et al., 1992). The resampling bootstrap method proceeds as follows: i)
from the set of ζi and Ki values for a particular absorber (with n ζi/Ki pairs), generate a
new set of ζi and Ki values by drawing ζi/Ki pairs with replacement from the original
set, such that the new set of ζi/Ki pairs has n points; ii) calculate Δμ/μ by fitting a
linear model to ζi vs Ki, and keep this value of Δμ/μ; iii) repeat this process many times
to obtain an ensemble of Δμ/μ values. The mean and standard deviation of this ensemble of
Δμ/μ values should be consistent with the results from the original RRM fit to the ζi
vs Ki values for that absorber. We show the results of this in figures 3.12 and 3.13, and
note good agreement with the fitted values from table 3.6. The good agreement seen is
unsurprising given the reasonable number of transitions used; the probability distribution
of the fitted parameters should be approximately Gaussian because of the central limit
theorem, combined with the fact that there is not a large range in the magnitudes of the
uncertainty estimates for the redshifts for different transitions.
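
The three steps of the bootstrap can be sketched as follows. This is a minimal illustration on synthetic data, not the thesis measurements. It assumes the linear model includes a free intercept and is fitted by weighted least squares. The function names, the number of resamples and the input values are all hypothetical.

```python
import numpy as np

def rrm_slope(k, zeta, sigma):
    """Weighted least-squares slope of the linear model zeta = a + slope * K."""
    w = 1.0 / sigma**2
    kbar = np.sum(w * k) / np.sum(w)
    zbar = np.sum(w * zeta) / np.sum(w)
    return np.sum(w * (k - kbar) * (zeta - zbar)) / np.sum(w * (k - kbar) ** 2)

def bootstrap_slopes(k, zeta, sigma, n_boot, rng):
    """Refit the slope on n_boot resamples drawn with replacement."""
    n = len(k)
    slopes = np.empty(n_boot)
    for i in range(n_boot):
        # Step i): draw n (zeta_i, K_i) pairs with replacement.
        idx = rng.integers(0, n, size=n)
        # Step ii): refit the linear model and keep the slope.
        slopes[i] = rrm_slope(k[idx], zeta[idx], sigma[idx])
    return slopes  # step iii): the ensemble of slope values

# Synthetic data: 60 transitions with a true slope of 5e-6.
rng = np.random.default_rng(1)
k = rng.uniform(-0.01, 0.05, size=60)
sigma = np.full(60, 1.0e-6)
zeta = 5.0e-6 * k + rng.normal(0.0, sigma)
slopes = bootstrap_slopes(k, zeta, sigma, n_boot=2000, rng=rng)
estimate, scatter = slopes.mean(), slopes.std(ddof=1)
```

Comparing `estimate` and `scatter` with the slope and analytic uncertainty from the original fit corresponds to the comparison described above.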

3-4.4 Comparison with the results of Reinhold et al. 

To directly compare our results with those of Reinhold et al. (2006), we performed an
analysis in which we utilised the same transitions used in that work. For Q0405-443, this
removed 16 transitions and added 3, the latter of which we had initially decided were badly
contaminated and excluded from our main analysis. For Q0347-383, we removed 35
transitions and added 4. We used the RRM so as to yield a like-with-like comparison.
The results of this are set out in table 3.7. It is difficult to compare our results directly
with those of Reinhold et al. in a statistical fashion, because the results are derived from
the same spectra, and therefore the data are not independent. However, in both cases we
see a shift toward Δμ/μ = 0. Although the inclusion of the result from Q0528-250 shifts
a combined Q0405-443 + Q0347-383 result toward zero, our result based on the same
transitions used by Reinhold et al. is null.

Figure 3.12: Bootstrapped distribution of Δμ/μ for the reduced redshift method (RRM)
for the z = 3.025 H2 absorber toward Q0347-383. The histogram shows the distribution
of values of Δμ/μ from the bootstrap resamplings. The red curve shows a theoretical Gaussian
distribution with the same mean and standard deviation as the bootstrapped samples.
The horizontal bar indicates the 1σ confidence interval from the Gaussian. The good
agreement between the curve and the histogram shows that the results are well described
by a Gaussian with Δμ/μ = (6.4 ± 9.6) × 10⁻⁶.

Table 3.7: Reduced redshift method (RRM) constraints on Δμ/μ for Q0405-443 and
Q0347-383 based on the same transitions utilised by Reinhold et al. (2006). The use of
the same transitions therefore indicates whether the differences between our results and
those of Reinhold et al. are due to the differing transition list or due to some other factor.
We see that Δμ/μ from our fits is significantly different to the results of Reinhold et al.
on this basis, and is consistent with our results presented in table 3.5. Therefore, the
difference between our results and those of Reinhold et al. is not due to the transitions
used.

Quasar spectrum   z_abs   Δμ/μ (our analysis)     Δμ/μ (Reinhold et al.)
Q0405-443         2.595   (10.2 ± 8.9) × 10⁻⁶     (27.8 ± 8.8) × 10⁻⁶
Q0347-383         3.025   (12.0 ± 14.0) × 10⁻⁶    (20.6 ± 7.9) × 10⁻⁶
Combined result   2.81    (10.7 ± 7.5) × 10⁻⁶     (23.9 ± 5.9) × 10⁻⁶



Figure 3.13: Bootstrapped distribution of Δμ/μ for the reduced redshift method (RRM)
for the z = 2.595 H2 absorber toward Q0405-443. The histogram shows the distribution
of values of Δμ/μ from the bootstrap resamplings. The red curve shows a theoretical Gaussian
distribution with the same mean and standard deviation as the bootstrapped samples. The
horizontal bar indicates the 1σ confidence interval from the Gaussian. The bootstrapped
distribution is mildly left-skewed, with excess skewness ~ −0.4. The Gaussian is described
by Δμ/μ = (9.5 ± 9.1) × 10⁻⁶; the actual 1σ confidence limits from the bootstrap are
Δμ/μ ∈ [0.7, 18.2] × 10⁻⁶. The mode of the distribution coincides approximately with
the fitted value given in table 3.6. Further investigation shows that the skewness arises
from the presence of a few high statistical precision points at long wavelengths. Given the
good agreement between the empirical confidence limits from the bootstrap and the analytic
uncertainty values on Δμ/μ, the mild skewness seen here is of no practical consequence.

As noted in section 3-2.1.2, the modelling of the Lyman-α forest should not cause
appreciable deviations from a more simplistic treatment of the background continuum over a
large sample of transitions. We experimented with this by performing simple, linear fits
to the background Lyman-α flux in the vicinity of certain molecular hydrogen transitions
and found this to be true. Therefore, we do not believe that the deviation of the result of
Reinhold et al. from ours is due to inadequate modelling of the forest. Instead, we ascribe
the difference between our results to a combination of the better wavelength calibration
of our spectral data and the fact that the Reinhold et al. result is dominated by a few
points with j < which have particularly small error bars. 



3-5 Discussion of results 



3-5.1 Further investigations of Reinhold et al. (2006). 



Other researchers have attempted to replicate the findings of Reinhold et al.:
• Wendt & Reimers (2008) examined the Q0347-383 and Q0405-443 absorbers to
investigate the results of Reinhold et al., and found that |Δμ/μ| < 4.9 × 10⁻⁵ (95
percent confidence).

• Thompson et al. (2009) also analysed the Q0347-383 and Q0405-443 absorbers to
investigate the results of Reinhold et al. and those of King et al. (2008) (the results
presented above). They found that Δμ/μ = (−7 ± 8) × 10⁻⁶, which is inconsistent
with the results of Reinhold et al.

• Wendt & Molaro (2010) re-investigated the Q0347-383 absorber, using additional
data from program ID 68.B-0115(A), taken in 2002. Rather than co-adding spectra,
as is traditionally done, they fitted each of the exposures simultaneously. For each
exposure, they determined a velocity offset with respect to the other exposures
by maximising spectral cross-correlation. The exposures were then shifted onto
a common wavelength scale. They noted an average inter-spectrum wavelength
deviation of 2.3 mÅ, or 170 m s⁻¹ at 4000 Å. The Ly-α absorption against which the
H2 transitions are seen was fitted with a polynomial. From an initial set of 52 lines,
they examined the difference in the fitted wavelength of each transition between the
two data sets. They noted that only 36 lines have a difference
between the two sets of exposures of less than 3σ; some lines deviate by more than 5σ.
They rejected lines which differ by more than 3σ, leaving 36 lines for analysis. They
concluded that Δμ/μ = (15 ± 9stat ± 6sys) × 10⁻⁶, where the estimate of the systematic
comes from increasing the error bars to make the fit statistically consistent. They
suggest that the excess scatter seen in a plot of ζi vs Ki may be due to the way in
which they have approximated the Ly-α flux with a polynomial, which accords well
with our earlier arguments that not modelling the forest structure appropriately will
necessarily cause under-estimation of the statistical error on Δμ/μ.
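
The equivalence between the wavelength deviation and the velocity quoted in the last item follows from Δv = c Δλ/λ, which the short sketch below evaluates. The function name `velocity_shift_ms` is introduced here for illustration.

```python
C_KMS = 299792.458  # speed of light in km/s

def velocity_shift_ms(delta_lambda_angstrom, lambda_angstrom):
    """Velocity equivalent, in m/s, of a small wavelength shift."""
    return 1000.0 * C_KMS * delta_lambda_angstrom / lambda_angstrom

shift = velocity_shift_ms(2.3e-3, 4000.0)  # 2.3 mA at 4000 A
```

The result is about 172 m/s, which rounds to the 170 m/s quoted above.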

All of these results are consistent with the analysis of Q0405-443 and Q0347-383
presented in section 3-4, which is reassuring. The somewhat tighter confidence limits we
obtain compared to these works relate to the fact that these works have attempted to be
more conservative in analysing the spectra. However, given that χ²_ν = 1.06 for our
simultaneous RRM fit of Q0405-443 and Q0347-383, we do not think that our confidence
limits on Δμ/μ are too small.

3-5.2 Convergence for Q0528-250 

The Q0528-250 absorber has a sufficiently high optical depth that the low-J lines are
commonly saturated. That is, the transitions fall on the flat part of the curve of growth.
The curve of growth describes the change in the equivalent width of the transition with
increasing column density (see for example Vardavas, 1993, and references therein). In
this regime, χ² is relatively insensitive to changes in the column density, which makes
accurate determination of the column density difficult from a single transition. With
many transitions of different oscillator strengths, one can sample the curve of growth in
many different places and, in principle, obtain a good constraint on the column density.
However, we have left the total column density (effectively, the oscillator strength) free
for each H2 transition.

Because χ² is relatively flat with respect to N for saturated transitions, convergence to
the final estimate of the total column density for these transitions can be slow. Thus,
for a fit to Q0528-250 we generally see relatively fast convergence of most parameters,
but then many tens of iterations for which the change in χ² is only a few times larger
than the stopping criterion. In these iterations, Δμ/μ changes only very slightly, with
δ(Δμ/μ) < 0.1σ. Thus, although convergence of some parameters is
slow, the convergence of Δμ/μ is not affected. This relates to the fact that it is the line
centroiding which determines Δμ/μ, and the accuracy of the line centroiding should not
be markedly affected by reasonable changes in N.

Nevertheless, it would be preferable if the actual oscillator strengths could be used, as 
this would speed convergence. The final result should also be more reliable because of 
the use of fewer free parameters. We tried to modify our existing model for Q0528-250
to utilise the oscillator strengths, and found that in many transitions the model was a 
reasonable match to the data. However, in some transitions we found that the amount of 
absorption was over-predicted by the model relative to the data. In general, this means 
that the local continuum against which the H2 Voigt profile is calculated has been set too 
low. Determination of the true continuum is difficult in the forest, especially in the blue 
end of the forest, due to the fact that there are few regions of spectra with no apparent 
absorption. Given the relative unimportance of accurate determination of the column
density on the value of Δμ/μ, we therefore simply retained our model where the total
column density for each transition remained a free parameter, and did not pursue models 
where the column densities were constrained by the oscillator strengths. We leave this 
avenue to future research. 

3-6 Q0528-250 revisited 

The Q0528-250 constraint presented in section 3-4 is extremely precise. However, there
are good reasons to revisit this absorber. Some of the exposures which contribute to
the Q0528:A spectrum are not well-calibrated. By this, we mean that the
exposures do not have ThAr calibration spectra taken in the same observation block. The
design of VLT/UVES is such that the position of the spectrograph grating is reset between
different observation blocks. Although the specification is such that the placement of the
grating should be good to within 0.1 pixels (D'Odorico et al., 2000), the use of spectra for
which the ThAr spectra were taken in different observation blocks necessarily introduces
wavelength calibration uncertainties into the result. Additionally, the velocity structure of
the absorber is clearly complicated; further observations should allow better determination
of the velocity structure.

A more subtle concern for the earlier observations under program IDs P66.A-0594,
P68.A-0600 and P68.A-0106 is that the slit width used for the
observations was often significantly larger than the prevailing seeing. For these
observations, a 1 arcsecond slit was used in both the blue and red arms. The average ratio
of the seeing to the slit width, where seeing is quantified by the output of the DIMM^ and the
seeing values were weighted by the duration of the exposure, was 0.78. In one exposure of
1.6 hours, the ratio was as low as 0.48. In the case where the seeing is significantly smaller
than the slit width, the instrumental profile will be non-Gaussian, which complicates
the analysis. Although VPFIT allows for a numerically-provided instrumental profile with
which the Voigt profile model is convolved, we have assumed that the instrumental profile
is Gaussian.

As noted earlier, Q0528-250 was re-observed in late 2008/early 2009 under program ID
82.A-0087, with exposures totalling approximately 8.2 hours. All science exposures were 
taken with ThAr calibration exposures in the same observation block, meaning that we 
consider these exposures to be well-calibrated. The duration-weighted average of the
ratio of the DIMM average seeing to the slit width for these exposures is 1.09; in these
exposures, the slit width is more appropriately matched to the average seeing conditions.
The exposures from P82.A-0087 were kindly reduced, co-added with the previous exposures
and cleaned using UVES_popler by M. Murphy in a similar fashion to the previous
spectra. The co-added spectrum was then split into the new exposures (the combination
of those from P82.A-0087) and the old exposures. The advantage of doing this is that 
the use of more data provides a better constraint on the quasar flux continuum. We refer 
to the new spectrum generated from the older exposures as Q0528:B1, and the spectrum 
generated from the exposures from P82.A-0087 as Q0528:B2. 

On account of the considerations relating to the underestimation of flux uncertainties in 
regions of low flux given in section 3-3.3, instead of applying a heuristic model to correct 
the error arrays as was done previously, we applied the method described in section 2-2.2, 
which considers the actual inconsistency of the contributing spectra to each flux pixel 
about their weighted mean. 

Although an analysis of Q0528:B1 is effectively a re-analysis of Q0528:A, as the con- 
tributing exposures are the same, the use of a different correction for flux uncertainty 
under-estimation appreciably affects the final spectrum. Thus it is useful to compare an 
analysis of Q0528:B1 to Q0528:A to check that consistent estimates of Δμ/μ are obtained.

The additional exposures provide significant extra information with which the velocity
structure of molecular hydrogen can be constrained, and additionally more information to
constrain the structure of the forest. Therefore we have adopted the following approach
to fitting the spectra. Firstly, we fitted the regions of the Q0528:B2 spectrum containing
molecular hydrogen afresh. Then, we searched the region of the spectrum above the forest
for metal line absorbers from any redshift. We modelled these absorbers if they contained
atomic species which would generate transitions in the forest, and then looked to see if
transitions from these absorbers were blended with the molecular hydrogen transitions. If
they were, we rejected these molecular hydrogen transitions from our analysis. This caused
us to reject a small number of transitions which were in our previous fit (in particular:
L1R3 and L2R3). Once a satisfactory model was achieved, we then applied the same model
to the Q0528:B1 spectrum, in order to attempt to achieve a like-with-like comparison. We
then independently refined the models for the Q0528:B1 and Q0528:B2 spectra. The
models for the two spectra differ somewhat, due to the different SNR in different spectral
regions.

^Differential Image Motion Monitor

Additionally, HD (deuterated molecular hydrogen) was detected in this absorber with a
column density of log₁₀ N = 13.27 ± 0.08. HD is sensitive to a change in μ, and therefore
here we include HD in our analysis of Δμ/μ. Although HD should display a similar
velocity structure to H2, the low optical depth and the small number of transitions observed
mean that any such structure is unresolved. We therefore model the HD absorption with
only a single velocity component. That is, the constraint on Δμ/μ from HD is derived
only by considering potential velocity shifts of the HD lines with respect to each other,
and not with respect to molecular hydrogen. Malec et al. (2010) have collated oscillator
strengths, laboratory wavelength values and K_i values for HD; wavelength values are from
Hollenstein et al. (2006) and Ivanov et al. (2008), K_i values are from Ivanov et al. (2008)
and oscillator strengths were calculated by Malec et al. (2010) from Einstein A coefficients
given in Abgrall & Roueff (2006).

3-6.1 Transitions used

The transitions used in our re-analysis of Q0528-250 differ somewhat from our earlier
analysis. The transitions used are set out in table 3.8. In figure 3.14 we give the
relationship of K_i and J with λ₀ for the transitions used in our analysis.

Table 3.8: Transitions used in our re-analysis of Q0528-250. n gives the number of
transitions used.

Quasar spectrum   z_abs   n    Transitions used
Q0528-250:B2      2.811   76   L0R0, L0R1, L1P1, L1P2, L1R0, L1R1, L1R2, L2P1,
                               L2P2, L2P4, L2R2, L3P2, L3P3, L3P4, L3R3, L3R4,
                               L4P2, L4P3, L4P4, L4R2, L4R3, L5P2, L5P3, L5P4,
                               L5R3, L5R4, L6P3, L6P4, L6R3, L7P3, L7P4, L7R2,
                               L8P2, L8P3, L9R2, L9P2, L9P3, L9R3, L10P1,
                               L10P2, L10P3, L10P4, L10R1, L10R3, L11P3,
                               L11R2, L11R4, L12R0, L12R3, L13P1, L13P3,
                               L13R2, L13R3, L16R3, L15P3, L15R2, L15R3,
                               L16P1, L16P2, L16R2, L17P4, L17R3, W0P2,
                               W0Q3, W0R2, W0R4, W1R2, W1R3, W1Q2, W2P3,
                               W2R2, W2R3, W3Q4, W4P2, W4P3, W4Q3

Figure 3.14: Relationship of K_i and J with λ₀ for the transitions used in Q0528:B2.
Upper panel: the sensitivity coefficients, K_i, for the transitions used in our analysis of
Q0528:B2 (dark blue points) and not detected or not fitted (grey points). Lower panel:
the distribution of transitions with wavelength according to their J-level.

3-6.2 Velocity structure & Δμ/μ results

We re-examined the question of the velocity structure using the spectrum Q0528:B2 under
different scenarios. To do this, we considered three and four velocity components. In each
case, we applied a model in which corresponding components in transitions with the same
J had the same b-parameter (b = F(J)), and also the scenario in which corresponding
transitions had the same b-parameter regardless of J (b ≠ F(J)). These models are not
nested, and so we consider only the AICC rather than using the F-test. We give the AICC
for these scenarios, and the resulting values of Δμ/μ, in table 3.9. We note that we were
unable to obtain a stable fit for a 4-component model where b-parameters for components
were forced to be the same for all J-levels. In this model, the column density of one of
the components in the J = 1 transitions was driven down below the detection threshold
of N = 10^[…] cm⁻². This component was the second strongest component in the J = 2
and J = 3 transitions. Although we could have omitted this component, the substantial
differences in relative strength between the different components in the different J-levels
mean that this model is very unlikely to be a good model of the physical situation, and
therefore that the value of Δμ/μ derived is unlikely to be accurate.

Table 3.9: Analysis of the velocity structure of Q0528-250 using the spectrum Q0528:B2.
n gives the number of components. The second column defines whether the b-parameter for
different components is fixed or is different for transitions with different J. The column
ΔAICC shows the difference of the AICC with respect to the best-fitting model. For
the 4-component, b ≠ F(J) model, the column density of one component of the J = 1
transitions was rejected, where this component was strongly detected in other J-levels.
This implies that the model is not physically realistic, and so we label it as unstable.

n   Relationship between b and J   AICC           ΔAICC   χ²_ν    Δμ/μ (10⁻⁶)
3   b = F(J)                       11488.2        0.0     1.115   0.2 ± 3.2
3   b ≠ F(J)                       11653.8        165.6   1.141   0.4 ± 3.2
4   b = F(J)                       11510.4        22.2    1.117   3.4 ± 3.7
4   b ≠ F(J)                       Unstable fit   n/a     n/a     n/a


We immediately note the sign change from the results in section 3-4.3, although in the
n = 3 case the result is only marginally different from zero. It is clear from the n = 3 results that
a model where different J-levels have different b-parameters is preferred very strongly over
a model with the same b-parameter for each J-level. This accords well with our findings
from the earlier analysis.
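
The model comparison above can be sketched numerically. The function below uses the standard small-sample corrected form of the Akaike Information Criterion for a χ² fit, AICC = χ² + 2p + 2p(p+1)/(n − p − 1), which we assume here matches the definition used for table 3.9; the function and variable names are illustrative. The second part simply recomputes the ΔAICC column from the AICC values quoted in the table.

```python
def aicc(chisq, n_params, n_points):
    """Corrected Akaike Information Criterion for a chi-squared fit.

    chisq    : minimum chi-squared of the fit
    n_params : number of free parameters, p
    n_points : number of data points, n (must exceed p + 1)
    """
    if n_points - n_params - 1 <= 0:
        raise ValueError("need n_points > n_params + 1")
    correction = 2.0 * n_params * (n_params + 1) / (n_points - n_params - 1)
    return chisq + 2.0 * n_params + correction

# AICC values as quoted in table 3.9 (the unstable 4-component fit is omitted).
table_3_9 = {
    ("3 components", "b = F(J)"): 11488.2,
    ("3 components", "b != F(J)"): 11653.8,
    ("4 components", "b = F(J)"): 11510.4,
}

best = min(table_3_9.values())
delta_aicc = {model: round(value - best, 1) for model, value in table_3_9.items()}
```

The model with the lowest AICC has ΔAICC = 0 by construction; larger values indicate progressively weaker support relative to it.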

We note that in this case the 3-component model is preferred to the 4-component model, 
at about the same statistical significance as the 4-component model was preferred to the 
3-component model in our analysis of Q0528:A. There are several points to note here: 

1. The seeing conditions for the earlier spectra were quite variable, and may have
induced a significantly non-Gaussian instrumental profile^. The requirement for 4
components earlier may be a reflection of the non-Gaussian profile, rather than the
absorber itself. Because the components are unresolved at these resolving powers,
identification of the correct number of components is difficult.

2. Some of the exposures contributing to the earlier spectrum were poorly calibrated.
It is conceivable that wavelength miscalibrations could cause a model with more
complexity to be favoured.

3. Although we have attempted to apply the same forest model in analysing the 3- and
4-component models (although χ² in each case is obviously minimised with
respect to all the parameters), the construction of the forest model itself depends on
the choice of the H2 model. Strong H2 components will obviously have little effect
on the forest model because they are clearly distinguished from the forest. However,
the 4th component of the model is weak and unresolved visually. We created the
3-component fit by removing the weakest component from the 4-component fit. In
most regions, the resulting fit was reasonable; however, in a small number of regions
we found that we had to add weak forest components to account for the removal of
the 4th H2 component. To achieve a like-with-like comparison, we included these
extra forest components in the 4-component fit. This means that any test for the
statistical significance of the number of components depends somewhat on the choice
of forest model near the H2 lines.

4. We noted whilst we were iteratively refining the 3- and 4-component models
that the 4-component model was preferred for most of the refining
process, with ΔAICC ~ 10 in favour of the 4-component model for much of the time.
It was only in the last few rounds of refining the model that the 3-component model
became preferred as a result of changes made to a small number of regions. Therefore,
the choice of the correct number of components can be sensitive to decisions made
about the forest model in a small number of regions.

^Although the instrumental profile in this case might be non-Gaussian, the profile should remain
symmetric.

We conclude from this that although the Jeffreys' scale suggests that there is very signifi- 
cant evidence for the 3-component model over the 4-component model, in light of the fact 
that this evidence is conditioned on the correct choice of forest model and instrumental 
resolution, the actual preference for the 3-component model over the 4-component is rather 
weaker. These arguments apply similarly to our earlier preference for a 4-component model 
over a 3-component model. The issue of whether there are 3 or 4 components is simply 
very difficult to resolve given the actual SNR and resolution of the spectra available. 

On the basis of the statistical results in table 3.9, we choose Δμ/μ = (0.2 ± 3.2) × 10⁻⁶ as
our preferred statistical result for the analysis of Q0528:B2. This is obviously consistent
with no change in μ.

3-6.3 Consistency checks 

We can relax some of the assumptions made on our analysis of this absorber to explore 
whether they have a meaningful impact on the result. In particular, we explore here 
whether the result we obtain is significantly affected by our assumptions about the different 
J-levels. We follow a similar procedure to that used by Malec et al. (2010). 

3-6.3.1 Different Δμ/μ from different J-levels

Rather than allowing transitions from all J-levels to contribute to a single value of Δμ/μ,
within VPFIT we can calculate a value of Δμ/μ for each J-level (and one for HD separately).
Strictly, the values of Δμ/μ obtained are not independent because: i) they assume that
each J-level has the same number of components; and, ii) the redshifts are tied between
corresponding components in transitions arising from different J-levels. Nevertheless, this
is useful for quantifying the contribution that each J-level makes to the final result. Ubachs
et al. (2007) noted that, on account of the para-ortho distribution*, the J = 1 state
is significantly populated even at low temperatures. They suggested dividing the states
into a J ∈ [0, 1] set (cold states) and J ≥ 2 (warm states) to examine the impact of
temperature. We examine both of these cases in figure 3.15. We see that there is no clear
evidence for a difference of Δμ/μ obtained using transitions arising from different J-levels.

Figure 3.15: Left panel: Δμ/μ for each H2 J-level and HD, assuming that each J-level
has the same 3-component velocity structure. The J = 0 level has a substantially larger
uncertainty than the J ∈ [1, 4] transitions because only 3 J = 0 transitions are used in the
fit. HD has been plotted in units of 10^[…] to increase clarity for the H2 results. Right panel:
Δμ/μ for two groups of transitions, H2 J ∈ [0, 1] ("cold transitions") and J ∈ [2, 4] ("warm
transitions"). The black data points are the results where the redshifts for corresponding
components in the cold and warm transitions are assumed to be the same, and the red
points are where the redshifts are allowed to differ between the cold and warm transitions.



3-6.3.2 Other consistency checks 

We checked that our derived value of Δμ/μ was not unduly affected by the starting guess
that Δμ/μ = 0 by starting the optimisation with Δμ/μ = 10^[…]. The final value of Δμ/μ
under this circumstance was 0.210 × 10⁻⁶, compared to 0.187 × 10⁻⁶ when started from
Δμ/μ = 0. These two numbers differ by ~0.007σ, which is entirely negligible. This
demonstrates both that the final result is insensitive to reasonable choices of the starting
guess for Δμ/μ and that our optimiser is functioning adequately.

*In ortho-hydrogen the proton spins are parallel, whilst in para-hydrogen the spins are anti-parallel.

Other checks are possible. One could divide the fit into several pieces (such as fitting
odd- and even-numbered regions separately). Estimates of Δμ/μ derived from these sub-fits
should be consistent with each other. However, unless one divides the fit into a very
large number of sub-fits (each of which has only a small number of transitions fitted) it is
unlikely that significant deviations will be detected on account of the central limit theorem.
The limiting case of this method is where one considers the impact of each transition on
Δμ/μ. In the RRM this is easily handled by inspecting the residuals of the ζ values
about the linear model for ζ vs K_i. We cannot apply the RRM to Q0528-250 because
of the complicated velocity structure. However, one could use a jack-knife approach. In
this method, if n transitions are fitted, then one generates n new fits, where in the ith fit
one removes the ith H2 transition. One can then inspect the distribution of the n Δμ/μ
values obtained in this way to search for values which deviate strongly from the average;
such deviation implies that the fit is being strongly affected by the transition involved.
This could mean either that the transition is yielding a very precise constraint on the line
centroids or that the transition is an outlier; further work is required to determine which
of these possibilities is relevant. However, this requires a substantial amount of computing
time, and so we did not implement this check.
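
Although we did not run this check, the jack-knife procedure described above can be sketched in a few lines. The sketch below is a toy only: it replaces the full Voigt profile fit with an unweighted straight-line fit of a reduced redshift against K_i (so that the slope plays the role of Δμ/μ), and all function names are hypothetical.

```python
def fit_slope(k, zeta):
    """Least-squares slope of zeta against K (toy stand-in for a full fit)."""
    n = len(k)
    k_mean = sum(k) / n
    z_mean = sum(zeta) / n
    sxy = sum((a - k_mean) * (b - z_mean) for a, b in zip(k, zeta))
    sxx = sum((a - k_mean) ** 2 for a in k)
    return sxy / sxx

def jackknife_slopes(k, zeta):
    """Return the n slopes obtained by removing each transition in turn."""
    return [
        fit_slope(k[:i] + k[i + 1:], zeta[:i] + zeta[i + 1:])
        for i in range(len(k))
    ]

def most_influential(k, zeta):
    """Index of the transition whose removal moves the slope furthest from the mean."""
    slopes = jackknife_slopes(k, zeta)
    mean = sum(slopes) / len(slopes)
    return max(range(len(slopes)), key=lambda i: abs(slopes[i] - mean))
```

A transition flagged in this way is not necessarily an outlier; as noted above, it may simply carry a large share of the constraint, so the flag is a prompt for inspection rather than a rejection criterion.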

3-6.4 Systematic errors 

Whilst for the results given in section 3-4.3 we provided only statistical errors, given the
high statistical precision it is appropriate to attempt to estimate the impact of systematic
errors. Malec et al. (2010) noted possible contributions to the systematic error budget,
which include: i) known wavelength calibration errors due to uncertainties in the ThAr
wavelength calibration; ii) intra-order wavelength distortions of unknown origin; iii) the
effect of velocity structure decisions; and iv) the effect of re-dispersion of the spectra. We
consider each of these in turn.

3-6.4.1 Known wavelength calibration errors due to uncertainties in the ThAr 
calibration 

The calibration of the ThAr wavelength scale is not perfect; each of the ThAr transitions
displays a residual velocity offset about the best-fit polynomial solution. The RMS of the
residuals is ~70 m s⁻¹ in the blue arm and ~55 m s⁻¹ in the red arm (M. Murphy, priv.
communication). However, these fluctuations are random, and therefore will average out
if a large number of H2 transitions are used. Only systematic deviations from the true
wavelength solution should appreciably affect the best estimate of Δμ/μ. There are fewer
good ThAr lines in the blue end of the spectrum than in the red end, and therefore larger
deviations of the wavelength solution from the true solution are possible. The systematic
deviation in the blue end of the spectrum relative to the red end of the forest has an upper
limit of ~20 m s⁻¹ (M. Murphy, priv. communication). The maximum K_i value used in
the fit is 0.053, whilst the minimum is −0.009. This implies that the maximum possible
systematic due to this effect is given by δ(Δμ/μ) = (Δv/c)/ΔK_i, which is 1.1 × 10⁻⁶.
In reality, the effect is likely to be smaller than this as positive deviations should tend to
cancel somewhat with negative deviations. However, how to reduce the effect is unclear;
it may not simply scale as 1/√N. Therefore, we retain this estimate as the maximum
possible systematic effect due to this cause.
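
The arithmetic behind this upper limit is short enough to check directly. The snippet below evaluates δ(Δμ/μ) = (Δv/c)/ΔK_i for the values quoted in the text; the function name is ours.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s

def max_dmu_shift(delta_v, k_max, k_min):
    """Shift in dmu/mu mimicked by a velocity offset delta_v (m/s)
    across the full range of sensitivity coefficients [k_min, k_max]."""
    return (delta_v / C_M_PER_S) / (k_max - k_min)

# 20 m/s blue-to-red deviation, K_i ranging from -0.009 to 0.053.
shift = max_dmu_shift(delta_v=20.0, k_max=0.053, k_min=-0.009)
```

With these inputs the shift evaluates to about 1.08 × 10⁻⁶, which rounds to the 1.1 × 10⁻⁶ adopted above.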

3-6.4.2 Intra-order distortions of unknown origin 

The path that the quasar light takes through the telescope is similar but not identical 
to that from the ThAr calibration lamp — the ThAr light fills the slit nearly uniformly, 
whilst the quasar light does not. Due to the different light paths, the wavelength scale of 
the quasar light may be different to that of the ThAr light; the differences between them 
appear as an apparent distortion of the wavelength scale. Both long range and short-range 
distortions are possible. 

Griest et al. (2010) identified a pattern of distortion within echelle orders in Keck/HIRES 
spectra, such that the wavelength scale at the centre of echelle orders is distorted with 
respect to that at the echelle order edges. The peak-to-peak velocity distortion is
~500 m s⁻¹ at ~5,600 Å. The distortion was identified by comparing the calibration of a
spectrum obtained using a ThAr exposure to that obtained using an I2 absorption cell. The 
iodine cell is placed in the quasar light path, and the characteristic absorption spectrum 
is imprinted on the quasar spectrum. The use of an iodine cell therefore obviates the 
concern about optical path differences when using a ThAr lamp. Unfortunately, an I2 cell 
is not useful for calibration of general quasar observations, because the iodine transitions 
cover only a relatively narrow part of the optical range, and because of the loss of flux 
from the quasar as a result of the use of the cell. The observed distortion pattern appears 
to be dependent on wavelength, and the distortion may be larger at longer wavelengths. 
The precise origin of the distortions is unknown, and similarly it is unknown to what 
extent the distortions remain constant in time, and how they depend on extrinsic factors 
(e.g. telescope orientation, temperature, pressure and accuracy of quasar centering in the 
spectrograph slit). Therefore, it is not possible at present to adequately remove these 
distortions of the wavelength scale from observations. 

Whitmore et al. (2010) identified a similar effect in VLT/UVES spectra, with a
peak-to-peak velocity distortion of ~200 m s⁻¹. The distortion appears to be much less consistent
between echelle orders than that seen by Griest et al., however. Further observations have
shown that the observed wavelength distortion is definitely not constant over long periods
of time, i.e. more than several nights (M. Murphy, priv. communication), which makes
removal of the distortion extremely difficult.

Similar to Malec et al. (2010), we attempted to estimate the magnitude of the error
introduced into a determination of Δμ/μ as a result of the observed velocity distortions.
To do this, we used a triangular-shaped distortion, where the wavelengths of pixels at the
centre of echelle orders were displaced by +200 m s⁻¹ with respect to those at the echelle
order edges. The modification to the spectra was kindly implemented by M. Murphy
within UVES_popler. The shift in Δμ/μ after modifying the spectrum was −0.3 × 10⁻⁶.
Clearly this value is model-dependent: if the distortion has a different amplitude or form,
then the impact on Δμ/μ may be different. However, this estimate of the systematic is
likely to be of the correct magnitude. We therefore adopt a Gaussian with σ = 0.3 × 10⁻⁶
as an estimate of the systematic effect due to distortions of this type.
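
For concreteness, a triangular distortion of the kind described above can be written as follows. This is a minimal sketch under our own assumptions, not the UVES_popler implementation: the distortion is taken to be zero at the order edges, to rise linearly to the stated amplitude at the order centre, and to be applied as a first-order Doppler shift. The function names and the example order limits are hypothetical.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s

def triangular_velocity_shift(wavelength, order_start, order_end, amplitude=200.0):
    """Velocity shift in m/s: zero at the order edges, `amplitude` at the centre."""
    centre = 0.5 * (order_start + order_end)
    half_width = 0.5 * (order_end - order_start)
    fraction = 1.0 - abs(wavelength - centre) / half_width
    return amplitude * max(fraction, 0.0)

def distort_wavelength(wavelength, order_start, order_end, amplitude=200.0):
    """Apply the triangular distortion to a single wavelength (same units in and out)."""
    dv = triangular_velocity_shift(wavelength, order_start, order_end, amplitude)
    return wavelength * (1.0 + dv / C_M_PER_S)
```

For an order spanning 5000 to 5100 (in any wavelength unit), the shift is 0 m/s at 5000, 100 m/s at 5025 and 200 m/s at 5050.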

3-6.4.3 Velocity structure & spatial segregation 

As noted earlier, it is possible that transitions arising from different J-levels might be
spatially segregated (Jenkins & Peimbert, 1997; Levshakov et al., 2002). Assuming that
all J-levels arise from the same redshift in this event could spuriously produce Δμ/μ ≠ 0.
Although in all of our analyses the results are statistically consistent with zero, it
is of course possible that a non-zero Δμ/μ could be pushed towards zero by this sort
of systematic effect. Similar to Malec et al. (2010), we relaxed our assumption that
corresponding components in all J-levels arise from the same redshift. In particular, we
divided the data set into "cold" transitions, J ∈ [0, 1], and "warm" transitions, J ∈ [2, 4], as
was done earlier, but only tied the redshifts of the different J-levels within these two groups.
If there is spatial segregation, this should be seen as a statistically significant difference
between the redshifts of corresponding components between the two groups, and also as
a substantial shift in the values of Δμ/μ derived from the two groups compared to what
was obtained earlier.

In the right panel of figure 3.15 we directly compare Δμ/μ in the case where the velocity
structure was allowed to vary between cold and warm components, and note that there
is no appreciable shift. In figure 3.16 we show these considerations more directly by
examining the differences in the redshifts of the three components, and also the explicit
difference between Δμ/μ in the two cases considered. We see that there is no statistically
significant difference between the redshifts in any of the three components. Similarly,
we see that the shift in Δμ/μ is < 0.1σ, which we consider to be effectively negligible.
Thus, we conclude that there is no evidence for a systematic shift in Δμ/μ as a result of
segregation of the cold and warm J-levels.

Nevertheless, to quantify the possible error introduced by our assumptions regarding
velocity structure, we examine the actual shifts in Δμ/μ. The shift in Δμ/μ for the J ∈ [0, 1]
levels is 0.8 × 10⁻⁶ and for the J ∈ [2, 4] levels is 0.2 × 10⁻⁶. We thus take 0.8 × 10⁻⁶ as
an estimate of the potential error introduced into our analysis due to assumptions about
the velocity structure in order to be conservative.

Figure 3.16: Left panel: Change in the redshifts of components for the H2 fit for the
3-component model to Q0528:B2 when the velocity structure was allowed to vary
between the "cold" and "warm" components. The difference for each component is
defined as z(J ∈ [0, 1]) − z(J ∈ [2, 4]). Right panel: The difference in Δμ/μ when the
velocity structure was allowed to vary, defined as Δμ/μ(structure allowed to vary) −
Δμ/μ(structure not allowed to vary). The error estimate is calculated as the mean of the
error estimates in the two cases considered; the errors do not differ appreciably between
the two cases.



3-6.4.4 Re-dispersion of spectra 

The spectrum used is the result of the co-addition of exposures taken with different echelle
grating settings. During the co-addition, the spectra are placed on a common wavelength
grid. Because the spectra are re-binned, the choice of wavelength grid introduces
correlations between neighbouring pixels. More importantly, the choice of the wavelength grid
has the potential to affect Δμ/μ. To investigate this, we examined the effect of shifting
the wavelength grid by −0.2, −0.1, 0.1 and 0.2 pixels. The modification of the spectra
was kindly implemented by M. Murphy within UVES_popler. The shifts this caused in
Δμ/μ are −1.3 × 10⁻⁶, −2.1 × 10⁻⁶, +1.0 × 10⁻⁶ and +0.2 × 10⁻⁶ respectively.
The standard deviation of these values is ~1.4 × 10⁻⁶, and so we adopt 1.4 × 10⁻⁶ as an
estimate of the potential error in Δμ/μ on account of the re-dispersion of the contributing
exposures.
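
The quoted scatter can be checked from the four shifts listed above. The snippet below computes both the sample (n − 1) and population (n) standard deviations using the Python standard library; the values are in units of 10⁻⁶, with the exponent as we read it from the surrounding text.

```python
import statistics

# Shifts in dmu/mu (units of 1e-6) for grid offsets of -0.2, -0.1, +0.1, +0.2 pixels.
shifts = [-1.3, -2.1, 1.0, 0.2]

sample_sd = statistics.stdev(shifts)       # divides by n - 1
population_sd = statistics.pstdev(shifts)  # divides by n
```

The sample standard deviation is about 1.41 and the population value about 1.22, so the ~1.4 adopted above corresponds to the n − 1 convention.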

3-6.5 Result including systematic errors 

In table 3.10, we accumulate the potential systematic errors from our discussion above
and give our final estimate of Δμ/μ including the systematic component. Although the
distribution of systematic errors is likely to be Gaussian in many cases, the impact on
Δμ/μ arising from the distortion in the wavelength scale (for instance) is an upper limit.
The probability distribution of the sum of random variables is given by the convolution
of their individual probability density functions. Thus, to estimate our final uncertainty,
we convolve the distributions assumed for each of the sources of uncertainty, and give the
standard deviation of the resultant distribution as our uncertainty estimate.

This yields our final estimate of Δμ/μ for Q0528:B2, as

Δμ/μ = (0.2 ± 3.2_stat ± 1.9_sys) × 10⁻⁶ = (0.2 ± 3.7) × 10⁻⁶    (3.7)
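
The convolution argument can be illustrated with a small sketch. For independent terms the variance of the convolved distribution is the sum of the individual variances whatever their shapes, with a uniform distribution of half-width a contributing a²/3. The example below checks this against direct sampling for one Gaussian and one uniform term. The choice of shapes and the function names are our own assumptions for illustration; the sketch does not reproduce the entries of table 3.10. The last line confirms that the statistical and systematic errors of equation 3.7 combine in quadrature to the quoted total.

```python
import math
import random

def combined_sd(gaussian_sigmas, uniform_half_widths):
    """Standard deviation of a sum of independent Gaussian and uniform terms."""
    variance = sum(s ** 2 for s in gaussian_sigmas)
    variance += sum(a ** 2 / 3.0 for a in uniform_half_widths)
    return math.sqrt(variance)

def sampled_sd(gaussian_sigmas, uniform_half_widths, n_samples=50_000, seed=1):
    """Monte Carlo estimate of the same quantity, drawing from the convolved distribution."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_samples):
        total = sum(rng.gauss(0.0, s) for s in gaussian_sigmas)
        total += sum(rng.uniform(-a, a) for a in uniform_half_widths)
        draws.append(total)
    mean = sum(draws) / n_samples
    return math.sqrt(sum((d - mean) ** 2 for d in draws) / (n_samples - 1))

# Illustrative shapes only: one Gaussian term and one upper-limit (uniform) term.
analytic = combined_sd([0.3], [1.1])
sampled = sampled_sd([0.3], [1.1])

# Statistical and systematic errors of equation 3.7 (units of 1e-6), in quadrature.
total_error = combined_sd([3.2, 1.9], [])
```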

3-6.6 Q0528:B1 

At the time of writing, our analysis of Q0528:B1 is incomplete. However, preliminary 
results suggest that the value of Δμ/μ is likely to be very similar to that obtained with
the earlier spectrum (Q0528:A). 

3-7 Discussion and Summary of results 
3-7.1 Summary of results 

In this chapter, we presented analyses of high quality spectra of the highly-studied quasars
Q0347-383, Q0405-443 and Q0528-250 (Q0528:A). In our initial investigation of these
absorbers, we applied the DCMM to all three absorbers, and the RRM to the absorbers
towards Q0347-383 and Q0405-443 (the RRM cannot be applied to the absorber towards
Q0528-250 because of the overlapping velocity components). Our preferred results for
the absorbers towards Q0347-383, Q0405-443 and Q0528-250 from the analysis of these
spectra are derived from the DCMM and are Δμ/μ = (8.2 ± 7.5) × 10⁻⁶, (10.1 ± 6.6) × 10⁻⁶
and (−1.4 ± 3.9) × 10⁻⁶ respectively.

The spectra used for these quasars were obtained from the VLT/UVES archive. We
discussed potential problems with the exposures contributing to Q0528-250 in section
3-6, and why analysis of a new spectrum would be advantageous. We performed this
analysis on a new spectrum generated from exposures taken specifically for the purpose
of measuring Δμ/μ in 2008 and 2009. From this spectrum we obtained the constraint
Δμ/μ = (0.2 ± 3.2_stat ± 1.9_sys) × 10⁻⁶ = (0.2 ± 3.7) × 10⁻⁶ (Q0528:B2), where the systematic
error estimate is dominated by systematic distortions of the ThAr wavelength scale and
the process whereby individual exposures are co-added onto a common wavelength grid.

The value of Δμ/μ derived from Q0528:B2 is likely to be more accurate than that derived 
from Q0528:A because the exposures were taken with the specific purpose of searching for 
potential variation in μ, and should have superior wavelength calibration. Nevertheless, 
the values of Δμ/μ derived from Q0528:A and Q0528:B2 are statistically consistent, and 
so it is difficult to demonstrate any marked inaccuracy in the value from Q0528:A. 

All of these results are consistent with each other, and with no cosmological variation in 
μ. Taken together, these results are the best z > 1 constraints on cosmological evolution 
in μ. A weighted mean of these results yields (Δμ/μ)_w = (1.7 ± 2.4) × 10⁻⁶. 
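
The weighted mean quoted above can be checked with a short calculation. This is a sketch that assumes a standard inverse-variance weighted mean applied to the four values summarised in this section (the three DCMM results and the Q0528:B2 result, with statistical and systematic errors already combined); the function names are illustrative and not part of the original analysis.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    return mean, 1.0 / math.sqrt(total_weight)

def reduced_chi2(values, sigmas, model):
    """Chi-squared per degree of freedom about a single fitted value."""
    chi2 = sum(((v - model) / s) ** 2 for v, s in zip(values, sigmas))
    return chi2 / (len(values) - 1)

# Delta mu / mu in units of 1e-6, in the order:
# Q0347-383, Q0405-443, Q0528-250 (Q0528:A), Q0528:B2
values = [8.2, 10.1, -1.4, 0.2]
sigmas = [7.5, 6.6, 3.9, 3.7]

mean, error = weighted_mean(values, sigmas)
chi2_nu = reduced_chi2(values, sigmas, mean)
print(f"weighted mean = ({mean:.1f} +/- {error:.1f}) x 1e-6")
print(f"reduced chi-squared = {chi2_nu:.2f}")
```

The reduced chi-squared about the mean gives a quick check for excess scatter: values well above unity would suggest underestimated errors or unmodelled systematic effects.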

3-7.2 Comparison with other results 

In figure 3.17 we show all current extragalactic constraints on Δμ/μ.¹ Assuming that all 
of the high-redshift Δμ/μ points are well described by a single value of Δμ/μ, one can 
calculate the weighted mean of these points, which gives (Δμ/μ)_w = (2.2 ± 2.2) × 10⁻⁶. In 
calculating this weighted mean, we have added statistical and systematic error estimates 
in quadrature where they are available. We find χ²_ν ≈ 0.88 about this weighted mean, giving 
no evidence of excess scatter in the data (and therefore of unmodelled systematic trends). 

¹ We have ignored here the recent analyses of Wendt & Reimers (2008), Thompson et al. (2009) and 
Wendt & Molaro (2010), as the data used by these studies are not independent of the results of this 
chapter due to the use of common spectra. 

We return to the question of whether a weighted mean model is appropriate in chapter 6. 

Figure 3.17: Current best extragalactic constraints on Δμ/μ. The results of this work 
are shown as circles, the J2123-0050 constraint of Malec et al. (2010) is shown as a square, 
and the two ammonia results of Murphy et al. (2008a) and Henkel et al. (2009) are shown 
as triangles. The two Q0528-250 points at z = 2.811 have been slightly offset for clarity. 
The red line shows the weighted mean of the high-redshift constraints on Δμ/μ, and the 
blue dashed regions show the 1σ confidence limit on the weighted mean. 

3-7.3 Q0528-250 

The investigation of the z = 2.811 absorber toward Q0528-250 is extremely challenging. 
The additional spectrum of Q0528:B2 is important in that it was specifically taken 
for the purposes of investigating Δμ/μ. That is, ThAr exposures were taken so as to 
maximise wavelength accuracy. Nevertheless, the different components of the molecular 
hydrogen transitions are poorly resolved, making accurate determination of the structure 
difficult. Although investigations at higher SNR will incrementally aid this process, a 
more fruitful approach would be to obtain spectra at significantly higher resolving 
power (R > 100,000). With VLT/UVES, this would require slit widths of < 0.3 arcseconds 
(D'Odorico et al., 2000). This would result in an unacceptable loss of flux, unless an 
image slicer is used. The use of an image slicer makes the profile of the quasar light in 
the spatial direction more complicated on the spectrograph CCD, which in turn makes 
extraction of the spectrum more difficult. Moreover, for a fixed-aperture telescope, increased R 
comes at the expense of reduced SNR per pixel. Ultimately, routine observations at such 
high resolving powers may have to await the next generation of large telescopes, for which 
suggested spectral resolutions may be ~ 150,000 (Pasquini et al., 2008). Until the 
individual velocity components can be resolved, it may be difficult to ascertain the precise 
nature of the uncertainty due to velocity structure mis-specification. On the other hand, if 
the inter-component velocity spacing is comparable to the intrinsic velocity widths of the 
transitions, then observations with higher resolving powers will not yield such large gains. 
Nevertheless, the higher SNR per pixel and higher resolving powers that will be available 
with future telescopes should make the analysis of complicated H₂ absorbers more reliable 
than it is at present. 

3-7.4 Future work 

With the addition of the z = 2.059 molecular hydrogen system toward J2123-0050 (Malec 
et al., 2010) there are now five high-quality, independent measurements of Δμ/μ derived 
from H₂. An analysis of the z = 2.059 absorber toward J2123-0050 using VLT data is being 
undertaken by Freek van Weerdenburg (Vrije Universiteit), and should be available shortly. 
However, to constrain Δμ/μ adequately at high redshift, it will be imperative to increase 
the number of absorption systems utilised along different sightlines, in order to confidently 
map out Δμ/μ in different locations and at earlier times. 

As noted earlier, there is a clear lack of molecular hydrogen absorbers at high redshift 
which can be used to constrain Δμ/μ. Malec et al. (2010) noted that despite the large 
number (> 1000) of damped Lyman-α systems known (Prochaska & Wolfe, 2009), the 
number known to contain molecular hydrogen is less than 20. Ultimately, the paucity of 
absorbers may mean that Δμ/μ can be constrained more rapidly through methods 
which constrain combinations of fundamental constants (see chapter 6). 

The process of extracting values of Δμ/μ from the spectra, given the necessity of modelling 
the Lyman-α forest, is extremely time-consuming when done manually, owing to the large 
amount of time required to optimise parameter values after each trial modification of the 
fit. The continual progress in computing speeds has rendered this process substantially 
easier even since this work was started. Nevertheless, it would be ideal for the number of 
Δμ/μ measurements from H₂ to be increased by at least an order of magnitude. This will 
likely require many PhD students and much patience, but ultimately a move to automated 
methods is likely. Given the time required for the optimisation, manual inspection and 
alteration of the fit at each step easily outperforms any simple automated method at present. 
Nevertheless, cleverer approaches to automated fitting and better computing power 
may yield progress in this respect sooner rather than later. 

4-1 Introduction 

The fine-structure constant, α, is an extremely important fundamental constant with a 
rich history. In S.I. units, 

$$ \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c}, $$

where e is the electron charge, ε_0 is the electric permittivity of free space, ħ is Planck's 
constant divided by 2π, and c is the speed of light. In cgs units, α = e²/(ħc). The CODATA 
2006 recommended value of α is 1/137.035 999 679(94) (Mohr et al., 2008). α may be 
measured in many different ways, but the most precise values derive from measurements of the 
magnetic anomaly of the electron and muon, combined with quantum electrodynamics (QED) 
calculations. The magnetic anomaly is defined as a = (g − 2)/2, where g is the spin g-factor 
of the particle in question. The CODATA value is unfortunately substantially 
affected by an error in the QED calculations of Gabrielse et al. (2006), who 
contributed the most precise point in the CODATA analysis. Gabrielse et al. (2007) 
updated their result to α = 1/137.035 999 070(98) after correcting for this error. A more 
recent CODATA value is not yet available. The current most precise determination of α derives 
from Hanneke et al. (2008), who give α = 1/137.035 999 084(51) based on the electron 
magnetic anomaly. α may also be determined from the recoil effects of cold ensembles of 
Rb (Cladé et al., 2006; Cadoret et al., 2008) and Cs (Gerginov et al., 2006) atoms; these 
experiments are less precise by an order of magnitude but give results which are effectively 
independent of QED calculations (Salumbides, 2009). 
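
As a quick numerical check of the S.I. definition above, α can be evaluated directly from the constants that appear in it. This is a sketch only: the numerical constants below are CODATA 2018 values supplied here for illustration (they are not taken from the text, which quotes CODATA 2006), and the result is good to roughly the precision of ε_0 and ħ as entered.

```python
import math

# CODATA 2018 values in S.I. units (an assumption of this sketch, not from the text)
ELEMENTARY_CHARGE = 1.602176634e-19      # C
VACUUM_PERMITTIVITY = 8.8541878128e-12   # F/m
REDUCED_PLANCK = 1.054571817e-34         # J s
SPEED_OF_LIGHT = 299792458.0             # m/s

def fine_structure_constant(e, eps0, hbar, c):
    """alpha = e^2 / (4 pi eps0 hbar c), dimensionless."""
    return e ** 2 / (4.0 * math.pi * eps0 * hbar * c)

alpha = fine_structure_constant(
    ELEMENTARY_CHARGE, VACUUM_PERMITTIVITY, REDUCED_PLANCK, SPEED_OF_LIGHT
)
inverse_alpha = 1.0 / alpha
print(f"1/alpha = {inverse_alpha:.3f}")
```

The result agrees with the values of 1/α quoted above to the number of digits that this simple calculation can support.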

In quantum electrodynamics, α represents the strength of the coupling between the 
electron and the photon, and therefore determines the effective strength of the electromagnetic 
force. The practical usefulness of QED derives from the fact that α ≪ 1, which 
makes a perturbative expansion of the effect of QED in powers of α possible. In fact, the 
value of α given above is the low-energy value: α is a running coupling constant, the value 
of which changes depending upon the energy scale being probed; at the mass scale of the 
Z boson, α increases to α ~ 1/129 (Okun, 1996). When we discuss evolution in α, we are 
referring to evolution in the low-energy value of α. 



4-1.1 Sensitivity of transitions to a change in α 

For an alkali doublet (AD), the separation between the two fine-structure transitions scales 
as α² (Bethe & Salpeter, 1977). For a small change in α, Δα/α = (α_z − α_0)/α_0 (where 
|Δα/α| ≪ 1), the change in the doublet separation is given by 

$$ \frac{\Delta\alpha}{\alpha} \approx \frac{c}{2}\left[\frac{(\Delta\lambda)_z}{(\Delta\lambda)_0} - 1\right], \qquad (4.1) $$

where (Δλ)_z and (Δλ)_0 are the relative doublet separations in the cloud rest-frame, at 
redshift z, and in the laboratory (Varshalovich et al., 2000; Murphy, 2002). The constant 
c (not to be confused with the speed of light) is different for different doublets, and accounts 
for higher order relativistic effects. For the Si IV λλ1393, 1402 doublet, c ≈ 1 (Murphy, 2002). 
Considering AD-type transitions in quasar spectra leads to the alkali-doublet method (AD method). 
Effectively, one compares the observed relative spacings in quasar spectra with laboratory 
spectra to determine Δα/α. Because two transitions are being used, Δα/α is not degenerate 
with the determination of the redshift. 
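
The alkali-doublet estimate can be sketched in a few lines. This is an illustrative sketch that assumes the form Δα/α ≈ (c/2)[(Δλ)_z/(Δλ)_0 − 1] for relative doublet separations; the function name and the numerical separations are hypothetical, chosen only to show the scale of the effect.

```python
def delta_alpha_alkali_doublet(rel_sep_observed, rel_sep_lab, c_doublet=1.0):
    """Estimate dalpha/alpha from relative doublet separations.

    rel_sep_observed: relative separation in the absorber rest-frame
    rel_sep_lab:      relative separation measured in the laboratory
    c_doublet:        doublet-dependent relativistic constant (about 1 for Si IV)
    """
    return 0.5 * c_doublet * (rel_sep_observed / rel_sep_lab - 1.0)

# Hypothetical numbers: a laboratory relative separation of 6.45e-3
# (roughly the size of the Si IV doublet splitting) and an observed
# separation that is larger by a fraction 2e-5.
lab_separation = 6.45e-3
observed_separation = lab_separation * (1.0 + 2.0e-5)

dalpha = delta_alpha_alkali_doublet(observed_separation, lab_separation)
print(f"dalpha/alpha = {dalpha:.2e}")
```

Because the separation scales as α², a fractional change in the relative separation maps onto half that fractional change in α, so the example corresponds to Δα/α of about 10⁻⁵. Only the ratio of separations enters, which is why the estimate is insensitive to the absorber redshift.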

However, the AD method does not make use of all available information in the quasar 
spectra. In particular, different atomic transitions display significantly larger sensitivities 
to a variation in α than the Si IV transitions. If one considers a many-electron atom or 
ion, then the correction to the energy of an external electron due to relativistic effects can 
be written as 

$$ \Delta \propto (Z_n\alpha)^2 |E|^{3/2} \left[\frac{1}{j + 1/2} - C(j,l)\right], \qquad (4.2) $$

where Z_n is the nuclear charge, E is the electron energy (E < 0, |E| is the ionisation 
potential) and j and l are the total and orbital electron angular momenta (Murphy, 2002). 
The quantity C(j, l) determines the contribution to the correction from many-body effects. 
Equation 4.2 immediately suggests two strategies for probing a variation in α. Firstly, as the 
effect scales with Z_n², a comparison of transitions from light ions with those from heavy ions 
should lead to a large difference in the relativistic correction, which is therefore sensitive to 
a change in α. Secondly, the term C(j, l) begins to dominate with increasing j. As a result 
of this, the correction to an s-p transition in a heavy ion will be of the opposite sign to that 
for a d-p transition (Murphy, 2002). Thus the comparison of different types of transitions 
can yield substantial differences in the relativistic corrections. Consequently, comparing many 
different transitions from light and heavy atomic species simultaneously can substantially 
increase the sensitivity to a variation in α. This leads to the many-multiplet method (MM 
method). The MM method is described in considerable detail in Webb et al. (1999) and Dzuba 
et al. (1999b), and so we present only the salient features here. 

For experimental purposes, one can generally describe how the wavenumber of a given 
transition varies if α changes, for any multiplet and species. This yields 

$$ \omega_z = \omega_0 + q_1 x_z + q_2 y_z, \qquad (4.3) $$

where ω_z is the wavenumber of the transition in the rest-frame of the cloud at redshift z 
and ω_0 is the laboratory wavenumber (Dzuba et al., 1999b, a, 2001, 2002). x_z and y_z are 
related to Δα/α, with 

$$ x_z = \left(\frac{\alpha_z}{\alpha_0}\right)^2 - 1 \quad \text{and} \quad y_z = \left(\frac{\alpha_z}{\alpha_0}\right)^4 - 1. \qquad (4.4) $$

The q_1 and q_2 coefficients account for the relativistic corrections to the energy for a 
particular transition. As we consider only |Δα/α| ≪ 1, we can amalgamate q_1 and q_2 into 
q = q_1 + 2q_2, yielding 

$$ \omega_z = \omega_0 + q x_z. \qquad (4.5) $$

The sign and magnitude of q differ significantly depending on the species and transition 
under consideration. Ultimately it is not the actual value of q that constrains Δα/α, due 
to the need to simultaneously determine the redshift of the absorber, but the differences 
in the q values between the different transitions used. Note that because of the functional 
form of equation 4.5, if Δα/α = 0 then errors in the q coefficients cannot manufacture 
an observed Δα/α ≠ 0 (in the absence of systematic effects). 
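
The step from equation 4.3 to equation 4.5 can be checked numerically. The sketch below assumes x_z = (α_z/α_0)² − 1 and y_z = (α_z/α_0)⁴ − 1, and uses hypothetical values of q_1 and q_2 (they are not taken from table 4.1) chosen so that q = q_1 + 2q_2 = 1460 cm⁻¹.

```python
def x_z(dalpha):
    """x_z = (alpha_z/alpha_0)^2 - 1 for a fractional change dalpha."""
    return (1.0 + dalpha) ** 2 - 1.0

def y_z(dalpha):
    """y_z = (alpha_z/alpha_0)^4 - 1 for a fractional change dalpha."""
    return (1.0 + dalpha) ** 4 - 1.0

def shift_full(q1, q2, dalpha):
    """Wavenumber shift omega_z - omega_0 from equation 4.3, in cm^-1."""
    return q1 * x_z(dalpha) + q2 * y_z(dalpha)

def shift_linear(q1, q2, dalpha):
    """Wavenumber shift from equation 4.5 with q = q1 + 2*q2, in cm^-1."""
    return (q1 + 2.0 * q2) * x_z(dalpha)

Q1, Q2 = 1400.0, 30.0   # hypothetical coefficients, cm^-1
DALPHA = 1.0e-5         # fractional change in alpha

full = shift_full(Q1, Q2, DALPHA)
linear = shift_linear(Q1, Q2, DALPHA)
print(f"full shift   = {full:.8f} cm^-1")
print(f"linear shift = {linear:.8f} cm^-1")
print(f"difference   = {abs(full - linear):.2e} cm^-1")
```

For |Δα/α| ≪ 1, y_z ≈ 2x_z, so the two expressions differ only at second order in Δα/α. Both shifts vanish identically when Δα/α = 0 regardless of the q values used, which is the sense in which errors in q cannot by themselves create a spurious detection.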

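Note that 

$$ \left(\frac{\alpha_z}{\alpha_0}\right)^2 - 1 = \left(\frac{\Delta\alpha}{\alpha}\right)^2 + 2\left(\frac{\Delta\alpha}{\alpha}\right) \approx 2\left(\frac{\Delta\alpha}{\alpha}\right), \qquad (4.6) $$

where the approximation is valid for |Δα/α| ≪ 1. From equation 4.5, the velocity shift¹, 
Δv, for a given transition is thus given by 

$$ \Delta v \approx -\frac{2 c q_i}{\omega_0}\left(\frac{\Delta\alpha}{\alpha}\right) = -2 c q_i \lambda_0 \left(\frac{\Delta\alpha}{\alpha}\right), \qquad (4.7) $$

where q_i is the q coefficient for the transition and again |Δα/α| ≪ 1. 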

We show the effect of a variation in α on the wavelengths of different MM transitions in figure 
4.1, and show the relationship of the q coefficients with wavelength in figure 4.2. We give 
explicit values for the q coefficients in table 4.1. 

In figure 4.1, one can broadly see the presence of three types of transitions: i) transitions 
with large, positive q, which shift to shorter wavelengths with increasing Δα/α (e.g. the 
Fe II transitions at ~ 2400 and ~ 2600 Å, and the Zn II transitions); ii) transitions with 
large, negative q, which shift to longer wavelengths with increasing Δα/α (e.g. the Fe II 
λ1608 transition and the Cr II transitions); and iii) those which are relatively insensitive 
to a variation in α (e.g. the Mg I and Mg II transitions). 



¹ Δv/c ≈ Δλ/λ = −Δω/ω. 

Figure 4.1: The exaggerated effect of Δα/α on the wavelengths of certain MM 
transitions. Note that the case Δα/α = −1 corresponds to the non-relativistic case where 
c = ∞. Note also the presence of transitions which are relatively insensitive to a variation in α 
(e.g. Mg II, Mg I, Si II λ1526 and Al II λ1670), transitions which are strongly sensitive and 
move to longer wavelengths (e.g. Fe II excluding λ1608, and Zn II) and transitions which 
are strongly sensitive and move to shorter wavelengths (e.g. Fe II λ1608, Cr II and Ni II 
λλ1741, 1751). 



The Fe II transitions are very commonly fitted in the MM method, and it is worth 
considering the Fe II λ2382 transition (which has the highest oscillator strength of the Fe II 
transitions) as an example. For Δα/α = +10⁻⁵, the induced velocity shift in the Fe II 
λ2382 transition (which has q = 1460 cm⁻¹) is ~ −209 m s⁻¹. The instrumental resolution 
for VLT/UVES is typically ~ 6 km s⁻¹, and the pixel width is typically ~ 2 km s⁻¹. This 
immediately makes it obvious that searching for variations in α at the 10⁻⁵ to 10⁻⁶ level 
requires very high quality spectra, with good wavelength calibration. 
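
The velocity shift quoted above follows directly from equation 4.7. The sketch below assumes Δv ≈ −2 c q λ_0 (Δα/α) and an approximate rest wavelength of 2382.77 Å for the Fe II λ2382 transition; that wavelength is an assumption of this example rather than a value given in the text, and the function name is illustrative.

```python
SPEED_OF_LIGHT = 299792458.0  # m/s

def velocity_shift(q_cm, rest_wavelength_angstrom, dalpha):
    """Velocity shift in m/s for a transition with sensitivity q (cm^-1).

    Uses dv = -2 c q lambda_0 (dalpha/alpha), with lambda_0 converted
    from Angstrom to cm so that q * lambda_0 is dimensionless.
    """
    rest_wavelength_cm = rest_wavelength_angstrom * 1.0e-8
    return -2.0 * SPEED_OF_LIGHT * q_cm * rest_wavelength_cm * dalpha

# Fe II 2382: q = 1460 cm^-1 (from the text); wavelength is approximate
dv = velocity_shift(q_cm=1460.0, rest_wavelength_angstrom=2382.77, dalpha=1.0e-5)
pixel_fraction = abs(dv) / 2000.0   # relative to a ~2 km/s pixel

print(f"velocity shift = {dv:.0f} m/s")
print(f"fraction of a 2 km/s pixel = {pixel_fraction:.2f}")
```

The shift is about a tenth of a typical pixel and a few per cent of the instrumental resolution, which illustrates why both high signal-to-noise and accurate wavelength calibration are required.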

The MM method has significant advantages over the AD method (Murphy, 2002). In 
particular, the sensitivity gain (with a maximal Δq of ~ 4000) is potentially much larger 
than for the AD method (for which Δq ~ 500 for Si IV), yielding an order of magnitude 
sensitivity improvement in the best case. Additionally, the use of all transitions provides 
a statistical advantage simply through the use of more data. Furthermore, by using many 
different transitions one can better constrain the velocity structure of the absorber, and 
therefore the model is more likely to be a good representation of the physical generative 
processes, and thus the values of Δα/α should be more accurate. Finally, the use of 
transitions with both positive and negative q helps to minimise systematic effects (we saw 
in chapter 3 how the use of both the Lyman and Werner series played a similar role for 
the analysis of Δμ/μ using H₂). 

A key assumption exists in the application of the MM method that is not required in the 
AD method, namely that the transitions arise from the same location. In the AD method, 
the transitions considered in any particular analysis arise from the same ground state 
and therefore from the same location. However, in the MM method, it is possible that 
transitions from the ground states of different atoms/ions arise from different locations. 
This may occur if the gas clouds containing the relevant species are inhomogeneous. For 
a particular absorber, spatial segregation of the relevant species would lead to a relative 
shift in line positions, which would lead to a spurious detection of a variation in α for 
that system if the spatial segregation was sufficiently large. However, over an ensemble of 
absorbers this could only produce a variation in α if the centres of mass of some species 
were on average located closer to us than the centres of mass of other species along the 
line of sight to the absorbers. This would be an extreme violation of the Copernican 
principle, and therefore we consider this possibility not to be relevant. Although spatial 
segregation cannot bias the results of Δα/α over a sufficiently large ensemble of absorbers, 
it will manifest as excess scatter of the Δα/α points about a fitted model, and therefore 
this effect constitutes a potential random error with expectation value zero but non-zero 
variance. 

4-1.2 Application of the alkali-doublet method 

The strength of the constraint that can be placed on variation in α directly relates to 
the line widths of the observed transitions used; narrower lines will yield higher precision. 
Savedoff (1956) investigated AD separations in Seyfert galaxy emission spectra. Bahcall 
et al. (1967) first applied the AD method to absorption lines (which are generally narrower 
than emission lines) which seemed to be intrinsic to the quasar 3C 191, giving Δα/α = 
(−2 ± 5) × 10⁻² at z ≈ 1.95. Wolfe et al. (1976) analysed Mg II doublets from a DLA at 
z = 0.524. Potekhin & Varshalovich (1994) investigated C IV, N V, O VI, Al III and Si IV 
doublets to find |(1/α)(dα/dz)| < 5.6 × 10⁻⁴. Cowie & Songaila (1995) investigated the 
Si IV doublet to find |Δα/α| < 3.5 × 10⁻⁴. 

Varshalovich et al. (2000) applied the AD method to 16 absorption systems toward 6 
quasars using the Si IV λλ1393 and 1402 doublet, obtaining a weighted mean of Δα/α = 
(−4.6 ± 4.3) × 10⁻⁵ (statistical error only). They suggested that an additional error of 
±1.4 × 10⁻⁵ is required due to uncertainties in the laboratory doublet separation (Δλ)_0. 
However, Murphy (2002) noted that this seems optimistic given statements about the 
laboratory accuracy in Ivanchik et al. (1999); Murphy (2002) considered that a systematic 
error of order ~ 5 × 10⁻⁵ is needed. 

Murphy et al. (2001d) analysed 21 Si IV doublets using Keck/HIRES and found that 
Δα/α = (−0.5 ± 1.3) × 10⁻⁵. The dispersion of their data about the weighted mean yields 
χ²_ν = 0.95, which suggests that the statistical errors are correctly estimated. 

4-1.3 Application of the many-multiplet method 

The MM method was first applied in Webb et al. (1999) to 30 Keck/HIRES quasar spectra 
in the redshift range 0.5 < z_abs < 1.6, where they fitted the Mg I λ2852 line, the Mg II 
λλ2796, 2803 doublet and the five strongest Fe II transitions with q > 0 (λλ2383, 2600, 
2344, 2586 and 2374). This yielded tentative evidence that α was smaller in the absorption 
clouds than in the laboratory, with Δα/α = (−1.09 ± 0.36) × 10⁻⁵ (a 3σ detection). This 
result was dominated by the 14 systems above z_abs = 1, for which Δα/α = (−1.88 ± 
0.53) × 10⁻⁵. The z_abs < 1 systems yielded Δα/α = (−0.17 ± 0.39) × 10⁻⁵, which is 
consistent with no variation. 

The addition of more absorbers by Webb et al. (2001) and Murphy et al. (2001c) increased 
the significance of the detection, yielding Δα/α = (−0.72 ± 0.18) × 10⁻⁵ (a 4σ detection). 
Importantly, these works made use of the Ni II/Cr II/Zn II transitions, which display 
a significantly different relationship between λ and q. The Fe II/Mg II subset of those 
works yielded Δα/α = (−0.70 ± 0.23) × 10⁻⁵, whilst the Ni II/Cr II/Zn II subset yielded 
Δα/α = (−0.76 ± 0.28) × 10⁻⁵. The good concordance of these results despite the quite 
different relationship between λ and q for these transitions suggests that the apparent 
result is not due to some simple systematic effect. 

A third quasar sample augmented the results above to produce a sample with 128 quasar 
absorbers (Webb et al., 2003; Murphy et al., 2003a, b). This analysis found Δα/α = 
(−0.57 ± 0.10) × 10⁻⁵. Murphy et al. (2004) added an additional 15 absorbers to find 

$$ \Delta\alpha/\alpha = (-0.573 \pm 0.113) \times 10^{-5}, \qquad (4.8) $$

which we think represents the previous best constraint on variation in α from quasar 
spectra. Murphy et al. (2001a, 2003b, 2004) considered systematic effects in detail and 
found that the results presented cannot be ascribed to any known systematic effect. 

Other groups have applied the MM method to search for variation in α. Unfortunately, 
almost all of these works present the results of single quasar absorbers. Without a 
statistical sample, it is difficult to ascertain whether the statistical uncertainties are a good 
reflection of the true uncertainty, in that one cannot compare a set of Δα/α results 
to check for over-dispersion about a fitted model. Additionally, most of these analyses 
focus on a single absorber at z ~ 1.151 toward the bright quasar HE 0515-4414. 
Quast et al. (2004) reported from an analysis of that absorber using a VLT/UVES 
spectrum that Δα/α = (−0.04 ± 0.19 ± 0.27) × 10⁻⁵. Levshakov et al. (2005) also analysed 
this absorber using a VLT/UVES spectrum to find that Δα/α = (0.04 ± 0.15) × 10⁻⁵. 
A later analysis by Chand et al. (2006) reported Δα/α = (0.10 ± 0.22) × 10⁻⁵ from 
VLT/UVES observations, and Δα/α = (0.05 ± 0.24) × 10⁻⁵ from HARPS observations 
of the same absorber. Levshakov et al. (2006) reported Δα/α = (−0.007 ± 0.084) × 10⁻⁵ 
from VLT/UVES observations of the same absorber. The other highly studied absorber 
is the z ~ 1.839 absorber toward Q1101-264, for which Levshakov et al. (2005) reported 
Δα/α = (0.24 ± 0.38) × 10⁻⁵, updated to Δα/α = (0.54 ± 0.25) × 10⁻⁵ in Levshakov et al. 
(2007). 

Some of these works (e.g. Levshakov et al., 2007) state that they are using a method 
termed the Single Ion Differential α Method (SIDAM), which measures shifts between 
transitions of only one species (Fe II). Whilst this does obviate the potential concern of 
spatial segregation between different species, it also sacrifices sensitivity relative 
to the full MM method. SIDAM is just the multiplet analogue of the AD method, and is 
simply a restricted case of the MM method. In the ideal case, the use of SIDAM should 
cause no problems (other than a potential loss of sensitivity), but in practice there are 
three potential concerns, two of them related: 

1. Consider the differences in the q coefficients for the Fe II transitions used. All of 
the transitions display similar q ~ 1500 except the λ1608 transition (q ~ −1300). 
Although the Fe II transitions other than λ1608 do have small differences in q, the 
effect of this configuration is that almost all of any potential signal arises from the 
use of the λ1608 transition. This can lead to two potential problems: 

(a) Because of the relative differences in the q coefficients, any error in the λ1608 
laboratory wavelength or q coefficient will significantly affect the measured value 
of Δα/α. 

(b) For the same reason, if the λ1608 transition is contaminated in some fashion 
[either with interloping transitions (see section 4-4.4) or through other data 
problems, such as cosmic rays], the bias introduced into the measured value of 
Δα/α could be significant. 

The MM method helps to obviate both of these problems by reducing the 
effect of a problem in any one transition. For instance, if the SIDAM 
transitions were augmented with Mg II, then the impact of any contamination of the 
Fe II λ1608 transition would be reduced. Further reduction in the impact of 
any problems with the Fe II λ1608 transition would come from using additional 
MM transitions. 

2. The arrangement of q coefficients with wavelength when using SIDAM (see figure 
4.2) is such that q is strongly correlated with wavelength. This means that SIDAM 
is sensitive to long-range wavelength miscalibrations in a way which the MM method 
is not (providing that the full range of transitions can be used). 

It is for these reasons that we recommend the adoption of the full MM method: it confers 
significant resistance to problems such as those described here by making full use of the 
data, and yields a significant gain in sensitivity to Δα/α. 
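
The second concern in the list above can be illustrated with a toy calculation. The sketch below is not the fitting procedure used in this work: it assumes a simple unweighted least-squares fit of line velocity offsets to the model v_i = v_0 + s_i (Δα/α), with s_i = −2 c q_i λ_i from equation 4.7. The wavelengths are nominal values, the q values are the approximate ones quoted in the text (q ~ 1500, 1460 for λ2382, q ~ −1300 for λ1608), and the distortion slope of 0.1 m/s per Å is hypothetical.

```python
SPEED_OF_LIGHT = 299792458.0  # m/s

# (nominal rest wavelength in Angstrom, approximate q in cm^-1)
FE_II_TRANSITIONS = [
    (1608.0, -1300.0), (2344.0, 1500.0), (2374.0, 1500.0),
    (2382.0, 1460.0), (2586.0, 1500.0), (2600.0, 1500.0),
]

def sensitivity(wavelength_angstrom, q_cm):
    """Velocity shift in m/s per unit dalpha/alpha (equation 4.7)."""
    return -2.0 * SPEED_OF_LIGHT * q_cm * wavelength_angstrom * 1.0e-8

def fit_dalpha(transitions, velocities):
    """Unweighted least-squares slope of v_i = v0 + s_i * dalpha."""
    s = [sensitivity(lam, q) for lam, q in transitions]
    n = len(s)
    s_mean = sum(s) / n
    v_mean = sum(velocities) / n
    covariance = sum((si - s_mean) * (vi - v_mean) for si, vi in zip(s, velocities))
    variance = sum((si - s_mean) ** 2 for si in s)
    return covariance / variance

def linear_distortion(transitions, slope_ms_per_angstrom):
    """Velocity offsets from a linear stretch of the wavelength scale."""
    wavelengths = [lam for lam, _ in transitions]
    centre = sum(wavelengths) / len(wavelengths)
    return [slope_ms_per_angstrom * (lam - centre) for lam in wavelengths]

# No true variation in alpha; only a hypothetical long-range distortion
distorted = linear_distortion(FE_II_TRANSITIONS, 0.1)
spurious_dalpha = fit_dalpha(FE_II_TRANSITIONS, distorted)
clean_dalpha = fit_dalpha(FE_II_TRANSITIONS, [0.0] * len(FE_II_TRANSITIONS))

print(f"spurious dalpha/alpha = {spurious_dalpha:.2e}")
print(f"undistorted dalpha/alpha = {clean_dalpha:.2e}")
```

Because the sensitivities are strongly correlated with wavelength for this set of transitions, the fit cannot separate a linear stretch of the wavelength scale from a genuine change in α, and it returns a non-zero Δα/α even though none was put in. A transition set in which q is not correlated with wavelength would absorb much less of the distortion into Δα/α.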

The only other statistical sample is that from Chand et al. (2004), who reported Δα/α = 
(−0.06 ± 0.06) × 10⁻⁵ from an analysis of 23 absorbers from VLT/UVES. The statistical precision 
stated is surprising, given the relatively small sample size compared to that of Murphy et al. 
(2004). The VLT/UVES spectra are generally of higher signal-to-noise, but not sufficiently 
so to explain the quoted statistical precision. Murphy et al. (2007c) and Murphy et al. 
(2008b) analysed the results of Chand et al. (2004) and showed that the stated precision is 
far in excess of the maximum theoretical precision allowed by the data. The same criticisms 
apply to Chand et al. (2006) and Levshakov et al. (2006). Murphy et al. (2008b) applied the 
models of Chand et al. (2004) to the same data (however with different flux error arrays) 
using VPFIT, and ultimately derived a more appropriate value of Δα/α = (−0.64 ± 0.36) × 
10⁻⁵. However, Murphy et al. caution that the models used probably under-fit the spectra, 
and therefore this result should not be considered an optimal treatment of these absorbers. 
Murphy et al. (2008b) also commented on the optimisation algorithm used by Chand et al. 
(2004). In particular, curves of χ² vs Δα/α given by Chand et al. (2004) display 
point-to-point fluctuations which are not substantially smaller than unity. This implies that 
the optimisation algorithm used by Chand et al. (2004) was terminating prematurely, and 
therefore that the actual values of Δα/α given in that work are meaningless. We comment 
further on the design of non-linear least squares optimisation algorithms and their points 
of failure in chapter 7, and demonstrate through the application of Markov Chain Monte 
Carlo methods that the parameter estimates and uncertainties produced by VPFIT are 
robust. 

4-1.4 Objective of this chapter 

The purpose of this chapter is to use publicly available data from VLT/UVES to generate 
a MM sample similar in size to the Keck sample. If Δα/α = 0 and we could achieve a 
statistical precision comparable to or better than the Keck sample, we might be able to 
demonstrate inconsistency of the Keck Δα/α values with the VLT Δα/α values at the 
~ 3.5σ level. On the other hand, concordance of the VLT result with the Keck result 
would be a significant step in verifying the work of Murphy et al. (2004). We set out our 
analysis of the VLT data below, and compare our results with the Keck sample. 
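
The ~ 3.5σ figure quoted above can be reproduced with a simple estimate. The sketch assumes independent Gaussian errors and a hypothetical VLT outcome of Δα/α = 0 with the same statistical precision as the Keck result of equation 4.8; the function name is illustrative.

```python
import math

def tension_in_sigma(value_a, error_a, value_b, error_b):
    """Significance of the difference between two independent measurements."""
    return abs(value_a - value_b) / math.hypot(error_a, error_b)

# Values in units of 1e-5
keck_value, keck_error = -0.573, 0.113   # Murphy et al. (2004), equation 4.8
vlt_value, vlt_error = 0.0, 0.113        # hypothetical null result at Keck precision

significance = tension_in_sigma(keck_value, keck_error, vlt_value, vlt_error)
print(f"expected inconsistency: {significance:.1f} sigma")
```

A VLT sample with better precision than Keck would raise this figure, whereas a less precise one would lower it, since the error on the difference is the quadrature sum of the two uncertainties.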

4-2 Atomic data 

In table 4.1 we show the atomic data and q coefficients which were used in our analysis. 

Table 4.1: Atomic data for transitions usable in many-multiplet or alkali-doublet analyses, i.e. transitions with precise laboratory wavelengths. 
Information for isotopic and hyperfine components is given in italics. Columns 1 and 2 show the common names used for the transitions. 
Column 3 shows the mass number for each ionic species. The derivation of the laboratory wavenumbers, ω_0, is summarized by the value of X as 
follows: 0 - Measured wavenumber; 1 - Inferred from measured component wavenumbers; 2 - Inferred from measured composite wavenumber 
and measured component splitting; 3 - Inferred from measured composite wavenumber and calculated component splitting. Column 6 gives the 
reference(s) for the wavenumber measurement and/or calculations (specified below the table). Vacuum laboratory wavelengths, λ_0, are derived 
from the wavenumbers. Columns 8 and 9 show the lower and upper/excited state electronic configurations. The ID letters in column 10 offer 
a simple shorthand for labelling transitions used to fit absorption systems. Column 11 shows the ionization potential for the relevant ion, IP⁺, 
and for the ion with a unit lower charge, IP⁻. Column 12 shows the oscillator strengths, f, taken from Morton (2003) or the relative strengths 
of the hyperfine or isotopic components. The latter are taken from Rosman & Taylor (1998). The q coefficients and their uncertainties are from 
Dzuba et al. (1999b, a, 2001, 2002) and Berengut et al. (2004a, b). Note that uncertainties in the q coefficients are representative, not statistical. 
Wavenumbers are on the Whaling et al. (1995) Ar II calibration scale; the Fe II λ1608/1611 and Ni II wavenumbers have been scaled from their 
original values to account for the calibration difference between the Ar II scales of Norlen (1973) and Whaling et al. (1995). The exceptions 
to this are the Mg I/II wavenumbers, which are on a highly accurate absolute scale generated using a frequency-comb calibration system. The 
Whaling et al. (1995) scale best agrees with this absolute scale. Table modified from a version compiled by Michael Murphy. 

4-2.1 The q coefficients and the effect of redshift 

Figure 4.2: Relationship of q coefficients with rest wavelength. Points for different 
transitions have been given different shapes for clarity. In line with Murphy (2002), we define 
transitions with q > 700 cm⁻¹ as "positive shifters" and transitions with q < −700 cm⁻¹ 
as "negative shifters". Similarly, we define transitions with |q| < 300 cm⁻¹ as "anchor 
transitions", and transitions with 300 < |q| < 700 cm⁻¹ as "mediocre shifters". We have 
not shown Al III here as we treat the Al III transitions differently to previous analyses. 
For the low-z Fe II/Mg II combination, the q coefficients are strongly anti-correlated with 
wavelength, which makes this combination susceptible to low-order wavelength scale 
distortions. However, the arrangement of the q coefficients with wavelength is much more 
complicated for the high redshift systems; this arrangement confers significant resistance 
to systematics. 



It is instructive to consider the relationship of the q coefficients with wavelength, and in 
turn with the redshifts of the absorbers. We show the relationship of the q coefficients 
with rest wavelength in figure 4.2. 

At low redshifts (z < 1.3), absorbers predominantly consist of the Fe II/Mg II combination
(with Mg I sometimes included). This combination gives a reasonable sensitivity to Δα/α,
with Δq ~ 1500. However, the q coefficients are arranged such that q is significantly
anticorrelated with wavelength. This means that any effect which stretches or compresses
the spectrum will mimic variation in α. Similarly, if the spectral regions which contain
Mg and Fe are obtained at different times, and there is a wavelength calibration offset
between the spectra taken at different times, then a spurious value of Δα/α will emerge.
The latter circumstance is possible for the earlier Keck data, as all the data were
obtained when HIRES only had a single CCD chip; multiple exposures were required to obtain
full coverage of the optical region. Differences in quasar slit centering between the
exposures could produce significant wavelength miscalibrations between the exposures.
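The way a simple stretch of the wavelength scale mimics a change in α for the Fe II/Mg II combination can be sketched numerically. The example below assumes the usual first-order MM relation, in which a transition with rest wavenumber ω₀ and coefficient q is shifted in velocity by Δv ≈ −2c(Δα/α)q/ω₀, and fits Δα/α by least squares together with a free velocity offset that stands in for the fitted redshift. The rest wavelengths and q coefficients are approximate, illustrative values only, and the stretch amplitude is hypothetical; none of this is the fitting procedure used in the analysis.

```python
C_KMS = 299792.458  # speed of light in km/s

# (label, rest wavelength in Angstrom, q in cm^-1).
# Approximate, illustrative values; not the adopted atomic data.
TRANSITIONS = [
    ("Fe II 2344", 2344.21, 1210.0),
    ("Fe II 2382", 2382.76, 1460.0),
    ("Fe II 2600", 2600.17, 1330.0),
    ("Mg II 2796", 2796.35, 211.0),
    ("Mg II 2803", 2803.53, 120.0),
]


def sensitivity(wavelength, q):
    """Velocity shift (km/s) per unit da/a: dv = -2 c q / omega0."""
    omega0 = 1.0e8 / wavelength  # rest wavenumber in cm^-1
    return -2.0 * C_KMS * q / omega0


def spurious_da_over_a(stretch_kms_per_angstrom):
    """da/a recovered by least squares from a pure linear stretch."""
    k = [sensitivity(w, q) for _, w, q in TRANSITIONS]
    ref = sum(w for _, w, _ in TRANSITIONS) / len(TRANSITIONS)
    # Velocity offsets produced by the distortion alone (true da/a = 0).
    v = [stretch_kms_per_angstrom * (w - ref) for _, w, _ in TRANSITIONS]
    k_mean = sum(k) / len(k)
    v_mean = sum(v) / len(v)
    # Fit v_i = v0 + (da/a) * k_i; the free v0 absorbs a common redshift shift.
    cov = sum((ki - k_mean) * (vi - v_mean) for ki, vi in zip(k, v))
    var = sum((ki - k_mean) ** 2 for ki in k)
    return cov / var


if __name__ == "__main__":
    # Hypothetical stretch of 0.001 km/s per Angstrom of rest wavelength.
    print(spurious_da_over_a(1.0e-3))
```

Because q falls as wavelength rises in this set of transitions, the velocity offsets from the stretch correlate with the sensitivity coefficients, and the fit returns a non-zero Δα/α even though none was put in. Reversing the sign of the stretch reverses the sign of the spurious result.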

For absorbers of moderate redshift (1.3 ≤ z < 2.2), more transitions become useful.
For high column density systems, Zn II and Cr II may be observed. The Zn II/Cr II
combination is extremely important: the Zn II transitions display strongly positive q and
the Cr II transitions display strongly negative q. The difference between them provides
Δq ~ 2900 if only the Zn II λ2062 transition is used from the Zn II pair, and ~ 3800
if the Zn II λ2026 transition can be used. Additionally, the Zn II and Cr II transitions
are interleaved with each other, with different transitions shifting in different
directions. This produces a unique signature if α varies which is difficult to mimic
through systematic effects. If Zn II and Cr II are observed then Ni II is also generally
seen. In some cases, the Si II λλ1526, 1808 and Al II λ1670 transitions are available. If
these are fitted together with Fe II and Mg II then the anticorrelation of q with
wavelength disappears, creating a combination which is much more resistant to systematics.
At these redshifts, Fe II λ1608 also appears for high column density systems. This
provides a Δq of ~ 2500 between the Fe II transitions. Above redshifts of ~ 2, it becomes
difficult to use Mg II, either because it falls in regions affected by sky emission or
absorption, or because the transitions are located beyond the red end of the spectral
coverage.

For high redshifts (z > 2.2), the Mg transitions are no longer useful. Instead, the
predominant combination is some combination of the Si II, Al II and Fe II transitions. At
even higher redshifts the positive-q Fe II transitions also start to lose utility, leaving
Fe II λ1608 and λ1611 as the only useful Fe II transitions. For lower column density
systems, the Si II/Al II/Fe II λ1608 combination is prevalent, whereas for higher column
density systems we see these transitions and the Cr II/Zn II/Ni II combination.

Mn II is seen at low to high redshifts, but the Mn II transitions have q ~ 1000 and are
situated at wavelengths between the Fe II positive-q transitions and the Mg transitions.
Thus, they add statistical sensitivity but do not help break the anticorrelation of q and
wavelength if used only in combination with the positive-q Fe II transitions and the Mg II
transitions.

Ti II is only seen in high column density systems at low redshifts, of which we have few
in our sample. For z > 1.5 the Ti II transitions are strongly affected by sky
emission/absorption and so are difficult to use; for z > 2 they generally fall beyond the
red end of the spectral coverage. The easiest way to search for Ti II is to search near
the redshifts of DLAs, but DLAs can only be quickly identified from the ground for
z > 1.5, where the λ1216 H I absorption falls above the atmospheric UV cutoff at
~ 3000 Å. Effectively, the long rest wavelength of Ti II relative to the other
transitions used is why it does not feature prominently in our analysis. It is worth
noting, however, that Juliet Pickering and Matthew Ruffoni at Imperial College have
recently measured the wavelengths of the Ti II λλ1910.60, 1910.94 lines to the requisite
precision for use in a MM analysis (J. Webb, priv. communication). The oscillator
strength of the λ1910.60 line (f ~ 0.20) is comparable to the second strongest Ti II line
in the existing MM set, the λ3243 line (f ≈ 0.18). The λ1910.60 and λ1910.94 lines have q
coefficients of −1564 and −1783 respectively (Berengut et al., 2004a). These new
measurements will help increase the number of absorbers which are fitted with transitions
having negative q, which will both increase sensitivity to Δα/α and help constrain
systematics.

4-3 Spectral data 

Our spectral data are drawn from the archive of UVES, on the VLT. A collaboration
of researchers, coordinated by Michael Murphy, has attempted to reduce all the publicly
available quasar observations which might be used for determining Δα/α into
wavelength-calibrated, cleaned 1D normalised spectra. We are grateful for the efforts of
the following people: Matthew Bainbridge, Ruth Buning, Huw Campbell, Robert Carswell,
Ankur Chaudhary, Glenn Kacprzak, Ronan McSwiney, Helene Menager, Daniel Mountford,
Michael Murphy, Jon Ouellet, Tang Wei, Berkeley Zych.

Data were converted from 2D echelle spectra to 1D form using the midas pipeline, provided
by ESO. To calibrate the wavelength scale of the science exposures, the midas routine uses
a thorium-argon (ThAr) line list combined with a particular ThAr exposure. Unfortunately,
the default algorithm and line list used by the midas pipeline are suboptimal for the
reasons set out in section 3-3.2. As for the analysis of chapter 3, all our spectra are
calibrated with the algorithm and line list of Murphy et al. (2007a). The program
UVES_popler^, by Michael Murphy, was used to combine multiple exposures into a single, 1D,
normalised spectrum. UVES_popler was specifically written for this purpose. All this work
was done by the aforementioned people.

4-4 Methods & methodology
4-4.1 Instrumental profile 

The exposures for most absorbers were taken over many nights, often by different
observers under significantly varying observing conditions. In these circumstances,
defining an instrumental profile is difficult. In all cases we assumed a Gaussian
instrumental profile with a velocity FWHM of 6 km/s. Although this choice may cause
inaccuracies for any given absorber, particularly in the choice of the number of velocity
components (see section 4-4.2 below), the nature of the error made will be random from
absorber to absorber, and so it will average out over a sufficiently large ensemble of
absorbers.

^Available at http://astronomy.swin.edu.au/~mmurphy/UVES_popler.
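To make the assumed instrumental profile concrete, the sketch below converts the 6 km/s FWHM to a Gaussian standard deviation and smooths a model spectrum sampled on a uniform velocity grid. The pixel size, the function names and the edge handling are assumptions made for illustration; this is not how the fitting software applies the profile internally.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian FWHM to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel(fwhm_kms, pixel_kms, half_width_sigmas=5.0):
    """Normalised Gaussian kernel sampled on a uniform velocity grid."""
    sigma = fwhm_to_sigma(fwhm_kms)
    n = int(math.ceil(half_width_sigmas * sigma / pixel_kms))
    weights = [math.exp(-0.5 * (i * pixel_kms / sigma) ** 2)
               for i in range(-n, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_same(flux, kernel):
    """Convolve flux with a symmetric kernel, holding edge values constant."""
    half = len(kernel) // 2
    last = len(flux) - 1
    out = []
    for i in range(len(flux)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - half, 0), last)
            acc += w * flux[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    kernel = gaussian_kernel(6.0, pixel_kms=1.3)  # 1.3 km/s pixels: hypothetical
    model = [1.0] * 41
    model[20] = 0.0  # an unresolved, fully absorbed pixel
    print(fwhm_to_sigma(6.0), convolve_same(model, kernel)[20])
```

A FWHM of 6 km/s corresponds to a standard deviation of about 2.55 km/s, so an intrinsically narrow feature is spread over several pixels and appears shallower in the observed spectrum.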



4-4.2 Modelling the velocity structure 



Although a few absorbers can be well modelled by a single Voigt profile (such as the
absorber shown in figure 4.3), most absorbers display complicated structure. The structure
arises from different clouds of gas located along the line of sight, separated by
non-cosmological distances; we noted in section 1-7 that the absorbers are likely to be
associated with galaxy disks and halos. The velocity separation of the different
components of the absorption is typically tens to a few hundreds of kilometres per second,
which is comparable to the velocities of galaxy rotation curves. In general, the observed
absorption profile can be adequately modelled by adding Voigt components until a
statistically acceptable fit is achieved.

Figure 4.3: Our MM fit to the absorber at z_abs = 1.018 toward J220852-1934359, which
is apparently well fitted by a single component. The horizontal scale indicates the
velocity difference from the arbitrary redshift stated at the bottom for the given data
points. The black line indicates the observed normalised flux, with the green line
indicating our best fit solution. At the top of each box, the black line indicates the
standardised residuals (that is, [data - model]/error), with the red lines indicating
±1σ. The position of the blue tick marks indicates the fitted position of the single
component.

The process of building up the Voigt profile model usually requires synthesising the
information available from the observable transitions of different atomic species. Because
the minimisation process is non-linear, at each stage of building the Voigt profile model
one must supply initial guesses for the parameters. Choosing poor parameter guesses can
cause convergence to physically implausible models, or non-convergence. Therefore the
process of building the model starts from regions which contain the most information
available to constrain the parameters of the model (that is, regions where the structure
of the model is most clearly visible "by eye"). The ideal region has transitions of high
optical depth, but which are not saturated. Regions of spectra where the optical depth is
very low or very high do not show the velocity structure of the absorber clearly, and
therefore contribute weaker constraints on both the velocity structure and Δα/α. The
model building process thus starts with regions where the velocity structure is relatively
clear, and then proceeds to regions with less information to constrain the parameter set.
The process of building up the Voigt profile model proceeds as follows:

1. The fitting starts with the strongest unsaturated transition. For low redshift systems,
this is typically the Mg II λ2796 or λ2803 transition, although for the high column
density systems at low redshift this may be the Fe II λ2383 or λ2600 transition. For
high redshift systems, this may be any of the Al II λ1670, Si II λ1526 or Fe II λ1608
transitions. For intermediate redshift systems, a wide variety of transitions are often
visible, and any of the Fe II, Mg II, Al II or Si II transitions were generally used.
Transitions which were clearly affected by sky absorption were not used at this stage.
Similarly, the initial fitting was generally not done with the Mg ii transitions where 
they fall at A > SOOOA due to the possibility of contamination with sky absorption 
or emission. 

2. For the initial transition selected, Voigt components were added until a statisti- 
cally acceptable fit was achieved (see section 2-1.2 for the definition of "statistically 
acceptable"). 

3. This model was then applied to other transitions of the same atomic species, if these
transitions were available. This is almost always possible for Fe II, for which the
λλ2383, 2600, 2344 and 2586 transitions are often observed simultaneously. The
Voigt profile model was then refined, adding or removing transitions as necessary.
By applying the model to transitions from the same atomic species (i.e. same ground
state), one is guaranteed that the same model must be valid for all the transitions.
Deviation of the data from the model in one transition without a corresponding
deviation in the other transitions is therefore a likely signal of problems with the
data or contamination by other species.

4. The deviation of the data from the model just described is generally due either to
cosmic rays (which cause excess flux), absorption by an interloping species, or sky
emission or absorption. Where evidence exists for pixels affected by cosmic rays,
we clip out the affected pixels so that they do not contribute to the calculation
of χ². We explain the treatment of interlopers below in section 4-4.4. The midas
pipeline attempts to subtract sky emission as part of the spectral extraction process.
However, we have noticed that where sky emission is strong, the extraction appears
to be imperfect, leaving sharp spikes and dips in the spectrum. Where we find
evidence for such artifacts, we clip out the affected pixels. Where the spectra are
affected by absorption from sky lines, we either clip out the affected pixels or do not
use the affected transition.

5. We then applied the model from the single atomic species to other observable species.
We generally first applied the model to transitions of lower optical depth. This is
because the fact that the transitions are not saturated generally allows rapid
convergence of the applied model to a good fit. Because the line centres are generally
identifiable in the data, the parameters N and b for each line are relatively
uncorrelated with z, which makes convergence more likely to occur, and more rapid. We
then adjusted the model, adding extra components where this decreased the AICC.

(a) In some cases, weak transitions were rejected by vpfit, because their column 
density was driven below a user-adjustable cutoff (by default < lO^cm"^). 
This simply means that the model is statistically preferred without these components.
There is no way to force the inclusion of these components in a statistically
justifiable sense. There may be some bias introduced into the value of Δα/α, but given
that the components are weak this bias should be small. Moreover, because this effect
should bias Δα/α in a positive direction as often as in a negative direction between
different absorbers, it will average out over many absorbers if it does exist. Murphy
(2002) investigated the effect of fixing these dropped components at the column density
immediately before they were rejected, and found that "[n]o cases were found where the
values of Δα/α from the different runs differed significantly".

6. We then applied the model to the transitions of higher optical depth. Where
saturation is present, there is a relative degeneracy between N, b and z (although
constraints on b and z come from other transitions), and therefore we generally had
to be more cautious about our initial guesses for the values of N for each component
of the model for those transitions. In general, we would manually adjust the column
densities for the different components to obtain a reasonable starting fit by eye, and
then allow vpfit to minimise χ².

7. For the transitions fitted in point 6, it is sometimes necessary to add more
components in the optically thin regions of the profile. Because the optical depths for
the transitions in point 6 were higher than those in point 3, this means that these
components were not required in the regions in point 3. These components were
then introduced to the other regions in an attempt to see whether they could be
retained. In some regions these components were retained, but, as in point 5(a),
these components were sometimes rejected in the weaker transitions.

All of this process is guided by the model selection criteria set out in section 2-1.2.
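The decision in point 5 to keep an extra component only where it lowers the AICC can be illustrated with a few lines of code. The sketch assumes the standard small-sample corrected form of the Akaike Information Criterion, AICC = χ² + 2p + 2p(p + 1)/(n − p − 1), for p free parameters and n data points; section 2-1.2 is not reproduced here, so the exact definition used in the analysis should be checked against it. The function names and the numbers in the usage example are hypothetical.

```python
def aicc(chi2, n_params, n_points):
    """Corrected Akaike Information Criterion for a chi-squared fit."""
    dof_term = n_points - n_params - 1
    if dof_term <= 0:
        raise ValueError("need n_points > n_params + 1")
    return chi2 + 2.0 * n_params + 2.0 * n_params * (n_params + 1) / dof_term


def prefer_extra_component(chi2_old, p_old, chi2_new, p_new, n_points):
    """True if the larger model has the lower AICC and is therefore preferred."""
    return aicc(chi2_new, p_new, n_points) < aicc(chi2_old, p_old, n_points)


if __name__ == "__main__":
    # Hypothetical fit: adding one component (3 parameters: N, b, z)
    # lowers chi-squared from 120 to 100 over 200 pixels.
    print(prefer_extra_component(120.0, 6, 100.0, 9, 200))
```

In this picture an additional component costs roughly two units of AICC per free parameter, so it is retained only if it reduces χ² by more than that penalty.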

We show an example of a complicated Voigt profile fit in figure 4.4 to illustrate the process 
described above. 

At any stage of the above process, evidence may emerge for contamination by cosmic
rays or interlopers. The decision as to whether such contamination exists is effectively
based on consistency between the model and the observed flux data in different spectral
regions. For instance, with only two transitions (and no other information), one
cannot determine whether contamination is present in a particular transition. However,
with three transitions (or with the presence of additional information), the concordance
between two transitions allows one to infer that the third transition is problematic. As
the model is progressively constructed, the confidence with which one can infer the
presence of an interloper increases: concordance in corresponding regions of the model
between many transitions, together with significant excess absorption in the corresponding
region in another transition, constitutes strong evidence for an interloper.

Of course, in some cases the interloping transition can be identified on account of other
transitions from the same atomic species. This is particularly true for the C IV and Si IV
doublets. Nevertheless, the process of building up a good Voigt profile model depends on
the synthesis of information from all the transitions present.

4-4.2.1 Gravitational lenses 

We discovered a small number of absorbers for which the velocity structure appeared to be 
the same for transitions arising from the same ground state, but where the line intensities 
differed substantially. A particular example of this is the z = 0.82 absorber along the 
line of sight to J081331+254503. Further investigation demonstrated that this quasar is 
known to be gravitationally lensed. The complex line-of-sight geometry causes the effect 
described. We discarded any system for which this problem appeared and for which we 
could identify the quasar as a known gravitationally lensed system. 

4-4.3 Random and systematic errors 

Unfortunately, the Voigt profile decomposition is not unique. Errors in modelling the
velocity structure may impact the fitted value of Δα/α. We thus distinguish between
three different types of errors which may affect Δα/α:

Figure 4.4: Part of our MM fit to the z_abs = 2.152 absorber toward J233446-090812.
This is a complex absorption system, requiring many components in order to achieve a
statistically acceptable (χ²_ν ~ 1) fit. The horizontal scale indicates the velocity
difference from the redshift stated at the bottom for the given data points. The black
line indicates the observed normalised flux, with the green line indicating our best fit
solution. At the top of each box, the black line indicates the standardised residuals
(that is, [data - model]/error), with the red lines indicating ±1σ. The position of the
blue tick marks indicates the fitted position of each component. We draw the reader's
attention to the presence of a wide range of transitions, some with relatively small
magnitude q coefficients (Ni II λ1709 and the two Mg II transitions), some with large
magnitude, positive q coefficients (Fe II λλ2382, 2344, 2374, 2260) and some with large
magnitude, negative q coefficients (Fe II λ1608, Ni II λλ1741, 1751). Also included in
the fit, but not shown, are the transitions Cr II λλ2056, 2062, 2066, Zn II λλ2026, 2062
and Mn II λ2576. Note that for the stronger species, such as the Fe II λ2382 and Mg II
transitions, the centre regions of the profile are saturated, and thus a constraint on
Δα/α only comes from the optically thin wings. Conversely, for the weaker species (for
example, Fe II λλ1608, 2260 and the Ni II transitions) most or all of the profile is
optically thin, and thus a constraint on Δα/α is derived across the whole profile.
Importantly, a single velocity structure model provides a good model to all the observed
MM transitions. This serves to validate an assumption underlying the MM method, namely
that spatial segregation of the different species, if present, must be relatively small.
We note the presence of two weak interlopers in Fe II λ2260, and one in Fe II λ1608,
yielding an additional three tick marks.

1. Statistical errors. These errors are simply the errors on Δα/α which derive from
the propagation of uncertainty from the flux error array via the Voigt profile model.
These errors are the errors produced by vpfit from the covariance matrix at the
best-fitting solution.

2. Random errors. Random errors are any effects which might cause Δα/α to be measured
inaccurately when considering a single absorber. Significant errors made in
determining the correct velocity structure could cause an error of this type. Murphy
et al. (2008b) demonstrated that an under-fitted spectrum (i.e. one with a deficient
model for the velocity structure) gives erroneous values of Δα/α. Other potential
causes of random errors include: i) spatial segregation of different elements
(which cannot be preferentially biased along the radial sightline over a large number
of absorbers); ii) random blends with other transitions; iii) random departures
of the wavelength calibration solution from the true wavelength scale; iv) cosmic
rays and other uncleaned data glitches; v) incorrect determination of the broadening
mechanism (turbulent or thermal) for any component. Importantly, the effect
of random errors will average to zero when considering an ensemble of absorbers.
This is because these errors will displace Δα/α to be more positive as often as they
will displace it to be more negative. When considering only a single absorber, this
type of effect must be considered a systematic error. However, when considering an
ensemble of absorbers, the impact of this type of effect is random, and merely adds
extra scatter into the data.

3. Systematic errors. This is any error which systematically affects the value of Δα/α.
Such effects would include: i) inaccuracies in the laboratory wavelengths; ii) a
different heavy isotope abundance for Mg in the clouds relative to terrestrial values;
iii) systematic blends with other lines; iv) time-invariant differential light paths
through the telescope for different wavelengths; v) atmospheric dispersion for spectra
taken without an image rotator; vi) differential isotopic saturation; and vii) wavelength
miscalibration due to thorium-argon line list inaccuracies. Murphy et al. (2003a)
considered many potential systematic effects in detail. Note that these effects will
not necessarily produce the same spurious shift in Δα/α in every absorber. For
example, if some transitions have inaccurate laboratory wavelengths, the effect on
Δα/α will depend on which other transitions are fitted in the same absorber.
Nevertheless, the above effects are considered systematic errors because they will cause
similar or correlated shifts in certain subsets of absorbers.

There is a crucial distinction between these effects: some effects may be considered
systematics in single absorbers, but are not systematics in an ensemble of absorbers. For
this reason, we are cautious about placing too much emphasis on the interpretation of the
value of Δα/α from any individual absorber. We demonstrate later that the impact of
random effects is non-negligible. We explain our treatment of random effects in section
4-4.8.4.
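One generic way to account for random effects that add scatter beyond the statistical errors is to add a common term σ_rand in quadrature to each statistical error and increase it until the reduced χ² about the weighted mean equals one. The sketch below illustrates that idea only; the treatment actually adopted is the one described in section 4-4.8.4, which is not reproduced here and may differ. All names and numbers are hypothetical.

```python
def weighted_mean(values, variances):
    """Inverse-variance weighted mean."""
    weights = [1.0 / var for var in variances]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def reduced_chi2(values, stat_errors, sigma_rand):
    """Reduced chi-squared about the weighted mean with extra scatter added."""
    variances = [e * e + sigma_rand * sigma_rand for e in stat_errors]
    mean = weighted_mean(values, variances)
    chi2 = sum((x - mean) ** 2 / var for x, var in zip(values, variances))
    return chi2 / (len(values) - 1)


def solve_sigma_rand(values, stat_errors, iterations=100):
    """Smallest sigma_rand giving reduced chi-squared <= 1, found by bisection."""
    if reduced_chi2(values, stat_errors, 0.0) <= 1.0:
        return 0.0  # the statistical errors already explain the scatter
    lo, hi = 0.0, max(stat_errors)
    while reduced_chi2(values, stat_errors, hi) > 1.0:
        hi *= 2.0  # expand the bracket until the scatter is over-explained
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if reduced_chi2(values, stat_errors, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units, with equal statistical errors.
    values = [1.0, -1.0, 2.0, -2.0, 0.5]
    errors = [0.1] * 5
    print(solve_sigma_rand(values, errors))
```

The bisection works because the reduced χ² about the weighted mean decreases monotonically as σ_rand grows.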
4-4.4 Interlopers

Some transitions display excess absorption beyond what is predicted from other transitions 
from the same atomic species. This excess absorption is caused by another absorber 
located along the line of sight to the quasar, which is usually extragalactic. Even where a 
prediction cannot be made from another transition of the same atomic species, interlopers 
can still be detected when many transitions are fitted together, as the redshifts and b- 
parameters of each component are constrained. Although the contaminated sections of 
spectrum can be discarded, it is often possible to adequately model the contamination, 
thereby maximising use of the spectral data. 

We distinguish two types of interlopers: identifiable and unidentifiable.

4-4.4.1 Identifiable interlopers

In some cases, the interlopers can be identified and modelled simultaneously with the MM 
transitions. By "identified" we mean that the redshift, atomic species and wavelength of 
the transition which causes the excess absorption can be determined. In principle, one 
needs accurate rest wavelengths and q coefficients for the interloping transitions. This 
means that the interloping transition can be modelled if it is an MM transition from an
absorber at a different redshift, or if it is Si IV λλ1393 or 1402. If the interloper is
from the C IV doublet, the contamination can also be modelled despite the fact that the
rest wavelengths for this doublet are relatively poorly known. This is done by allowing
the C IV transitions to have a separate value of Δα/α, which is then discarded. This extra
parameter effectively absorbs any error introduced through inaccurate knowledge of the
rest wavelengths.

4-4.4.2 Unidentifiable interlopers 

In many cases, however, the interloping transition cannot be identified. In this case, our
decision as to how to proceed depends on the degree of contamination. If the degree of
contamination is small, and confined to a small area of the observed profile, we can
include unknown interloping transitions where the residuals of the fit
([data - model]/error) are poor until a statistically acceptable fit is achieved. Doing
this provides a statistically acceptable model of the contamination. Note that the
contribution to Δα/α of the affected MM transition will be reduced as a result of this, as
the interlopers included are unconstrained by other spectral regions. We show an example
of this in figure 4.5.

Figure 4.5: The Si II transitions from our MM fit to the absorber at z_abs = 2.638 toward
J212912-153848, shown without an included interloper. The horizontal scale indicates
the velocity difference from the arbitrary redshift stated at the bottom for the given
data points. The black line indicates the observed normalised flux, with the green line
indicating our best fit solution. At the top of each box, the black line indicates the
standardised residuals (that is, [data - model]/error), with the red lines indicating ±1σ.
The position of the blue tick marks indicates the fitted position of the single component.
The strength of the Si II λ1808 transition can be predicted from the model for Si II
λ1526. Si II λ1808 shows excess absorption at v ~ −40 km s⁻¹ which cannot be explained by
Si II λ1526. To account for this, we include a single, unconstrained interloper. After
including the interloper, the fit is statistically acceptable. Our fit including the
interloper may be found in figure E.120.

As the degree of contamination begins to increase, the potential error introduced into
Δα/α may grow. Our treatment of transitions affected by significant contamination
depends on whether there are other transitions from the same species available which can
be used to constrain the velocity structure for that transition. In the case of Fe II, a
wide variety of transitions are often available. Effectively, a fit to several other
transitions of the same species may allow the structure of the contamination to be
determined, particularly if the SNR is high. However, in some cases, there may be no other
transitions which can be used to obtain the velocity structure. This may occur as a result
of all of the transitions suffering from contamination from different absorbers, or
because the spectrum only includes one of the transitions of that species (due to gaps in
the spectral coverage), or because the species has no other transitions which could be
used for that purpose. The last case is particularly problematic for Al II, for which we
only use the Al II λ1670 transition; there are no other Al II transitions which can be
used to directly constrain the Al II structure.

Figure 4.6: Part of our MM fit to the absorber at z_abs = 1.268 toward J005758-264314.
The horizontal scale indicates the velocity difference from the arbitrary redshift stated
at the bottom for the given data points. The black line indicates the observed normalised
flux, with the green line indicating our best fit solution. At the top of each box, the
black line indicates the standardised residuals (that is, [data - model]/error), with the
red lines indicating ±1σ. The position of the blue tick marks indicates the fitted
position of the single component. We show here the model from Mg II λ2803 plotted over the
spectral region for Mg II λ2796. One can see that there is significant excess absorption
in Mg II λ2796 for v < −40 km s⁻¹. There also appears to be excess absorption in the
range −40 km s⁻¹ ≲ v ≲ −20 km s⁻¹. Due to the wide range in velocity over which the
absorption occurs, instead of attempting to model the absorption we clip away all pixels
for v < −20 km s⁻¹ in the Mg II λ2796 region for our actual fit (which may be found in
figure E.16).

This issue also occurs for Si II: although in theory both the Si II λλ1526, 1808
transitions can be used to constrain the velocity structure of Si II, the oscillator
strength for the λ1526 transition (f ≈ 0.13) is much larger than for the λ1808 transition
(f ~ 0.002). For many systems observed to have Si II λ1526 absorption, the column density
of Si II is not large enough to detect the λ1808 transition. Even if the λ1808 transition
is detected, it may be too weak to provide a meaningful constraint on the Si II structure.
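The size of this mismatch can be estimated from the oscillator strengths quoted above. For two transitions arising from the same ground state (same column density and Doppler parameter), the line-centre optical depth scales as the product fλ, so the ratio of optical depths is simply the ratio of fλ. The sketch below uses the rounded values from the text; the function name is hypothetical.

```python
def optical_depth_ratio(f_strong, lambda_strong, f_weak, lambda_weak):
    """Ratio of line-centre optical depths for two lines of the same species.

    Assumes the same column density and Doppler parameter for both lines,
    so that the optical depth scales as f * lambda.
    """
    return (f_strong * lambda_strong) / (f_weak * lambda_weak)


if __name__ == "__main__":
    # Si II 1526 (f ~ 0.13) against Si II 1808 (f ~ 0.002), values as quoted.
    print(optical_depth_ratio(0.13, 1526.0, 0.002, 1808.0))
```

With these rounded inputs the λ1526 line is roughly 55 times stronger than λ1808, which is why λ1808 is often undetected when λ1526 is unsaturated.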

In cases where we are unable to obtain a good constraint on the velocity structure of a
particular species from other unaffected transitions of the same species, and the degree
of contamination is not small, we are very cautious about inserting interlopers, due to
the potential bias this could introduce into Δα/α for this absorber. Note that any bias
introduced here is a random effect, and therefore will average out when considering an
ensemble of absorbers. Nevertheless, we wish to avoid introducing extra scatter into the
Δα/α values where possible. In these cases, we clip out the pixels which appear to be
affected by contamination, leaving a wide buffer on either side. Note that we can only
do this where another transition from the same species exists. Otherwise, components
situated in the middle of the clipped pixels might have very little, if any, spectral data
to constrain them, and thus their column densities could take on values which would not be
consistent with the general model used for the absorber. If there is no other transition
for the species in this case, we simply do not use the transition. In figure 4.6 we show
an example of where we have clipped out pixels because of contamination by an interloper.
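Clipping contaminated pixels with a buffer amounts to building a boolean mask over the velocity grid of the affected fitting region. The following is a minimal sketch with hypothetical names and values; the buffer width in particular is an arbitrary choice for illustration, not a value taken from the analysis.

```python
def clip_mask(velocities, contaminated_ranges, buffer_kms):
    """Return a list of booleans: True for pixels kept in the fit.

    velocities          -- pixel velocities in km/s
    contaminated_ranges -- list of (v_min, v_max) intervals judged contaminated
    buffer_kms          -- extra margin clipped on either side of each interval
    """
    keep = []
    for v in velocities:
        clipped = any(lo - buffer_kms <= v <= hi + buffer_kms
                      for lo, hi in contaminated_ranges)
        keep.append(not clipped)
    return keep


if __name__ == "__main__":
    velocities = [-60, -50, -40, -30, -20, -10, 0, 10]
    # Hypothetical contamination between -45 and -35 km/s, 10 km/s buffer.
    print(clip_mask(velocities, [(-45, -35)], buffer_kms=10))
```

Pixels flagged False would be excluded from the χ² sum, so any component centred among them is constrained only by the other transitions of the same species.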

Fitting contamination has the potential to introduce a random bias for individual
absorbers; however, this may nevertheless reduce systematic effects. To see this, consider
a hypothetical absorber where the Mg II λλ2796, 2803, Al II λ1670 and the Fe II λλ2383,
2600, 2344 transitions are available, but where the Al II λ1670 transition suffers from
some minor contamination in part of the observed profile, and the absorber of the
contaminating transition cannot be identified. One could simply ignore the Al II λ1670
transition and fit the Mg II and Fe II transitions. However, deriving Δα/α from just the
Mg II/Fe II combination is not very robust to a simple stretching or compression of the
wavelength scale (Murphy et al., 2003a). The Mg II/Fe II/Al II combination just described
is much more robust against this effect, as the Mg II and Al II anchor transitions are
positioned on either side of the high-q Fe II transitions. Modelling the contamination of
the Al II profile here may introduce a random error, but this can be averaged out in the
context of many absorbers. The potential introduction of this random error is easily
justified by the increased resistance to systematic effects (such as wavelength scale
distortions, which in principle could be common to many absorbers or spectra). In the
situation just described, the only choices are to discard the Al II transition, or to
model the contamination. We choose the latter option where the degree of contamination is
minor, and the former where the contamination is severe.

4-4.4.3 Transitions in the Lyman-α forest

It also happens that some transitions fall in the Lyman-α forest ("the forest"), a dense
series of absorption lines blueward of the quasar Lyman-α emission line. These lines
are caused by H I absorption along the line of sight to the quasar. We are cautious about
using MM transitions which fall in the forest, due to the uncertainties in determining
the structure of the forest. Nevertheless, the use of MM transitions in the forest may
afford significantly better constraints on Δα/α. Although this often occurs with low-z
Mg II/Fe II absorbers, with some Fe II transitions falling in the forest, it also occurs
in high-z systems, where the Si II λ1526/Al II λ1670/Fe II λ1608 combination is common.
Where the SNR is high for the transitions which fall in the forest, we model the forest
structure with H I absorption. If the SNR is low, determination of the forest structure
can be difficult, and therefore we do not utilise the contaminated transitions. Again, we
emphasise that although this has the potential to introduce bias for a single absorber,
because the contamination is random from absorber to absorber, it must average out over
a large number of systems. We have used transitions which fall in the forest in 27 of the
1142 spectral fitting regions in the VLT sample (2.4 percent).

4-4.5 Cr II λ2062, Zn II and Mg I λ2026

The Cr II λ2062 (2062.24 Å) and Zn II λ2062 (2062.66 Å) lines are relatively closely spaced,
being separated by ∼62 km s⁻¹. For narrow absorption systems, one can distinguish
between these transitions as they do not overlap. However, these transitions are most
commonly associated with DLAs, where the velocity structure is generally complicated,
and the system displays absorption over tens to hundreds of km s⁻¹. In this case, the
Cr II λ2062 and Zn II λ2062 transitions often overlap. The velocity structure for these
transitions can be determined by simultaneously modelling these transitions with the Cr II
λλ2052, 2056, 2066 and Zn II λ2026 transitions.

There is one point of caution here. For high column density systems, a potential blend
exists with Mg I λ2026. Mg I λ2026 is weak, with oscillator strength f = 0.113, and is
rarely seen. One can in principle use the Mg I λ2852 (f = 1.83) information to constrain
the Mg I structure. In this case, a joint fit of Zn II λ2026 and Mg I λλ2026, 2852 will ensure
that the Zn II λ2026 results are not biased by any absorption due to Mg I λ2026. However,
the absorbers for which Mg I λ2026 might be detected are often at high redshift, in which
case Mg I λ2852 is often unusable, either due to heavy contamination by sky emission or
absorption, or because it is located beyond the red end of the spectral coverage. In this
particular circumstance, we are generally cautious about fitting Zn II λ2026. Where we
consider that the Zn II λ2026 transition might be affected by Mg I λ2026, and we are
unable to utilise Mg I λ2852, we do not include the Zn II λ2026 transition.

4-4.6 Physical constraints 

As in the previous Keck analyses, we required that the b-parameter of each modelled
component for a particular species in the fit is related to the corresponding components for
other species. The two extreme cases are wholly thermal broadening and wholly turbulent
broadening. In general, there will be contributions from both mechanisms
($b^2 = b_{\rm therm}^2 + b_{\rm turb}^2$); however, we have found that most systems are
generally well fitted with turbulent broadening. As noted in section 2-1.1 (and by Murphy,
2002), it is possible to explicitly determine the degree of thermal and turbulent broadening;
however, in this circumstance the b-parameters are generally poorly determined, which makes
the optimisation difficult.

It turns out that the turbulent fit is preferred on the basis of the AICC in 71 percent
of the absorbers, and the thermal fit in 29 percent of the absorbers. However, it should
be noted that the fits were initially constructed with turbulent broadening and then
converted to thermal broadening. Had the fits been constructed thermally and then
converted to turbulent, these figures might have changed significantly. We emphasise
that mistakes made in choosing turbulent or thermal fitting may bias Δα/α for a single
absorber, but these effects must average to zero over a large number of absorbers due to
the random nature of the bias from absorber to absorber.

The previous analyses of the Keck results required that Δα/α calculated using both thermal
and turbulent fits differed by no more than 1σ for that absorber to be included in
their ensemble, where the difference is considered only in terms of the statistical error.
However, the generally higher SNR of the VLT data (compared to the Keck data) often
leads to very precise statistical bounds on Δα/α. This makes the 1σ-difference criterion
difficult to fulfil in a significant number of cases. We describe below how we resolve any
potential inconsistency between Δα/α values from the thermal and turbulent fits.

In determining how to resolve any potential inconsistency, there are three cases to consider.
i) Where the difference between the fits is substantial, as measured by the AICC, one
wants to take the statistically preferred fit. ii) Where the quality of the fits is similar
($\mathrm{AICC_{turbulent}} \approx \mathrm{AICC_{thermal}}$) and the values of Δα/α are the
same, then it does not matter which fit is used. iii) If the values of the AICC for the
thermal and turbulent fits are similar, but the values of Δα/α produced by those fits differ
significantly, then the statistical precision accorded to Δα/α should be reduced to account
for the conflicting evidence, and the value of Δα/α should be somewhere between the two cases.

To resolve this problem, we use a method-of-moments estimator which takes into account
the relative differences in the AICC and the agreement, or otherwise, of the values of
Δα/α. We estimate the underlying probability distribution of Δα/α for the absorber in
question as the weighted sum of two Gaussian distributions (one for the thermal result,
one for the turbulent), with centroids given by the best-fit value of Δα/α for each fit, and
σ equal to σ(Δα/α) for each fit. We weight the sum by the penalised likelihood of the fits
via the AICC (see Liddle, 2007). That is, if

$$k = \exp(-\mathrm{AICC_{turbulent}}/2) + \exp(-\mathrm{AICC_{thermal}}/2), \tag{4.9}$$

$$j_1 = \exp(-\mathrm{AICC_{turbulent}}/2)/k, \tag{4.10}$$

$$j_2 = \exp(-\mathrm{AICC_{thermal}}/2)/k, \tag{4.11}$$

$$a_1 = (\Delta\alpha/\alpha)_{\rm turbulent}, \tag{4.12}$$

$$a_2 = (\Delta\alpha/\alpha)_{\rm thermal}, \tag{4.13}$$

$$s_1 = \sigma\left[(\Delta\alpha/\alpha)_{\rm turbulent}\right], \quad\text{and} \tag{4.14}$$

$$s_2 = \sigma\left[(\Delta\alpha/\alpha)_{\rm thermal}\right], \tag{4.15}$$

then matching the first two moments of our weighted sum of distributions with a Gaussian
yields

$$m = \Delta\alpha/\alpha = j_1 a_1 + j_2 a_2, \quad\text{and} \tag{4.16}$$

$$\sigma_{\Delta\alpha/\alpha} = \sqrt{j_1 s_1^2 + j_2 s_2^2 + j_1 a_1^2 + j_2 a_2^2 - m^2}. \tag{4.17}$$

This covers all the cases described above. In particular, where the AICC is similar but
Δα/α differs significantly between the turbulent and thermal fits, the estimated error
increases with the difference between them, providing resistance to incorrectly determining
the line broadening mechanism. To see this, note that with $j_1 + j_2 = 1$, equation 4.17
reduces to

$$\sigma_{\Delta\alpha/\alpha} = \sqrt{j_1 s_1^2 + (1 - j_1) s_2^2 + j_1 (1 - j_1)(a_1 - a_2)^2}. \tag{4.18}$$

Thus, errors only ever increase from our smallest error estimate, and therefore this method
could be considered conservative. In the event where one broadening mechanism is
significantly preferred, then our result will be effectively the same as if only that
broadening mechanism was considered. For the case where the fits are statistically
indistinguishable ($j_1 = j_2$), Δα/α is given by the simple mean of the two values of Δα/α,
and the variance is the simple mean of the individual variances plus $0.25(a_1 - a_2)^2$.
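
The combination in equations 4.9 to 4.18 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and argument order are hypothetical, and the smaller AICC is subtracted from both values before exponentiating, which leaves $j_1$ and $j_2$ unchanged but avoids numerical underflow for large AICC values.

```python
import math

def combine_thermal_turbulent(a_turb, s_turb, aicc_turb, a_therm, s_therm, aicc_therm):
    """Method-of-moments combination of turbulent and thermal fit results.

    a_* are the best-fit values, s_* their 1-sigma errors, aicc_* the AICC of each fit.
    Returns (m, sigma) following equations 4.16 and 4.18.
    """
    # Subtracting a common reference does not change the normalised weights j1, j2.
    ref = min(aicc_turb, aicc_therm)
    w1 = math.exp(-(aicc_turb - ref) / 2.0)
    w2 = math.exp(-(aicc_therm - ref) / 2.0)
    k = w1 + w2
    j1, j2 = w1 / k, w2 / k

    m = j1 * a_turb + j2 * a_therm
    # Mixture variance: weighted mean of the variances plus a disagreement term.
    var = j1 * s_turb ** 2 + j2 * s_therm ** 2 + j1 * j2 * (a_turb - a_therm) ** 2
    return m, math.sqrt(var)
```

When the two AICC values are equal, the function returns the simple mean of the two values, with the disagreement term $0.25(a_1 - a_2)^2$ added to the mean variance; when one fit is strongly preferred, it returns essentially that fit's value and error.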



4-4.6.1 Al III 



In principle the Al III transitions can be included in a MM fit; however, the ionisation
potential of Al III is somewhat different to that of the other MM species described. Due to
variations in the incident radiation field, the Al III transitions may therefore not arise
from the same location, and therefore velocity, as the other MM transitions. If the Al III
transitions arise at significantly different velocities to the other MM transitions then an
error would be introduced into Δα/α for a system with Al III included (although this effect
must average to zero over a large number of absorbers, as there is no reason for a systematic
bias in the centroid of the Al III transitions with respect to the other MM transitions along
a line of sight to Earth).

Generally, the profiles of the other MM transitions used correlate well with each other. By
this, we mean that the relative column densities between corresponding velocity components
are similar for different MM transitions. However, we have noticed that the absorption
profiles for some Al III transitions in some absorbers differ significantly in the relative
line strengths between components, when compared to other MM transitions. Importantly, we
found some absorbers where it was difficult to apply the same velocity structure model to
Al III transitions and the other MM transitions simultaneously. For this reason, we are
cautious in fitting Al III together with the other MM transitions.

Therefore, we include and model Al III if both the transitions are available, and allow
the spectral data to contribute to Δα/α derived from the other MM transitions for that
absorber, but do not constrain the modelled structure with the velocity structure from
other MM transitions. Given the small difference in the q coefficients between the two
Al III transitions (Δq ∼ 250), the statistical contribution of Al III to Δα/α is low; however,
given that the exposures have already been obtained, it is prudent to try to maximise our
use of the existing data.

As an example: if Al III is not utilised in the z = 1.857 absorber towards J013105-213446,
the turbulent fit value of Δα/α changes from (0.30 ± 1.43) × 10⁻⁵ to (0.43 ± 1.44) × 10⁻⁵.
Similarly, if Al III is not utilised in the z = 1.71 absorber towards J014333-391700, the
turbulent fit value of Δα/α changes from (-2.20 ± 2.67) × 10⁻⁵ to (-2.32 ± 2.68) × 10⁻⁵.

We note that previous works have included the Al III transitions as part of the MM
analysis, and it was demonstrated by Murphy et al. (2003a) that the inclusion of these
transitions did not significantly alter the Keck results. Nevertheless, the approach we have
adopted is conservative.

We have also observed less substantial relative line strength differences between the Mg I
transitions and other MM transitions, but in no case did we find a system where we could
not apply the same velocity structure model to the Mg I and MM transitions, and so we
include the Mg I transitions in the full MM analysis.

4-4.7 Aggregation of Δα/α values from many absorbers
4-4.7.1 Weighted mean 

If one assumes that all the Δα/α values are described by a constant offset from the
laboratory values, one can combine the Δα/α values together using a weighted mean.
This process is valid provided that the Δα/α values support a constant value of Δα/α.
If Δα/α ≠ 0, then this implies that there must be a transition at some point from the
present day (Δα/α = 0), and therefore the Δα/α values should be inspected to see if a
transition point can be identified. A further implication is that one must examine the
residuals about the fit for a weighted mean, plotted against various parameters of interest
(primarily redshift and sky position), to determine if unmodelled trends exist.

4-4.7.2 Dipole fit 

A dipole+monopole model constitutes the first two terms of the spherical harmonic
expansion. The simplest dipole model is of the form

$$\Delta\alpha/\alpha = A\cos(\Theta) + m, \tag{4.19}$$

where Θ is the angle between the pole of the dipole and the sky position under
consideration. A is an angular amplitude and m (the monopole) represents a possible offset of
Δα/α from the laboratory value. An equivalent (and more computationally convenient)
form is

$$\Delta\alpha/\alpha = \mathbf{c}\cdot\mathbf{x} + m, \tag{4.20}$$

where $\mathbf{x}$ is a unit vector pointing towards the direction under consideration and
$\mathbf{c}$ contains the amplitude and direction information of the dipole. The components
of $\mathbf{c}$ ($c_x$, $c_y$ and $c_z$) are easily related to the right ascension (RA) and
declination (dec) of the direction of the dipole. $|\mathbf{c}|$ gives the magnitude of the
dipole. In this form, Δα/α is linear in the $c_i$ and so the $c_i$ can be determined through
weighted linear least squares.

Although naively we might expect that m = 0, some theories contemplate otherwise. This
could be possible if Δα/α depends on the local gravitational potential or density (Khoury
& Weltman, 2004; Mota & Shaw, 2007; Olive & Pospelov, 2008); laboratory conditions
differ quite significantly in this regard from the conditions in the quasar absorbers. Note
that by including the m term one obtains an explicit test of this idea. Additionally, in
the presence of temporal evolution of α, m amounts to the average effect of temporal
evolution. In any particular redshift slice, m therefore represents the angle-independent
value of Δα/α.

Clearly a model of this form is unphysical, as it takes no account of any redshift
dependence: in general, A = A(z). Nevertheless, a model which includes only angular dependence
is useful because it provides a method of detecting spatial variations in α which does not
require the specification of a functional form for A(z). Use of this model is valid under
several possible circumstances. One is where any variation in α with redshift in the sample
along a particular direction is small compared to variation in α in the opposite direction.
This might be possible in our sample, depending on how α might vary, as we typically
probe lookback times of greater than 5 gigayears. Another is where α does vary
significantly with redshift in our sample and the distribution of absorber redshifts does not
vary greatly with sky position. With enough data, one could simply take redshift slices
and apply this model to each redshift slice, thereby building up the functional form of
A(z) in a model-independent manner. However, given that we have only ∼300 absorbers
between the Keck and VLT samples, we simply cannot slice the data enough to do this for
more than about two redshift bins. Another consideration is the effect of choosing a particular
form of A(z). An incorrect choice of A(z) may reduce sensitivity to detect an effect, and
could lead to the wrong conclusion if the choice is sufficiently bad. As a result, we
explore an angle-dependence model initially, and later consider explicitly including distance
dependence.
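
Because the form Δα/α = c · x + m is linear in its four parameters, the dipole fit reduces to solving a 4 × 4 set of weighted normal equations. The following Python sketch illustrates this using only the standard library. It is not the optimisation code used for the analysis: the function names are hypothetical, RA is assumed to be in hours and declination in degrees, and the small Gaussian-elimination solver stands in for whatever linear algebra routine one prefers.

```python
import math

def unit_vector(ra_hr, dec_deg):
    """Unit vector for a sky position (RA in hours, dec in degrees)."""
    ra, dec = math.radians(ra_hr * 15.0), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_dipole(ra_hr, dec_deg, y, sigma):
    """Weighted least squares for y = c.x + m. Returns [cx, cy, cz, m]."""
    rows = [unit_vector(r, d) + (1.0,) for r, d in zip(ra_hr, dec_deg)]
    w = [1.0 / s ** 2 for s in sigma]
    # Normal equations: (X^T W X) p = X^T W y
    normal = [[sum(wi * row[a] * row[b] for wi, row in zip(w, rows))
               for b in range(4)] for a in range(4)]
    rhs = [sum(wi * row[a] * yi for wi, row, yi in zip(w, rows, y)) for a in range(4)]
    return solve(normal, rhs)

def dipole_direction(cx, cy, cz):
    """Convert c to (amplitude, RA in hours, dec in degrees); requires |c| > 0."""
    amp = math.sqrt(cx * cx + cy * cy + cz * cz)
    ra_hr = (math.degrees(math.atan2(cy, cx)) % 360.0) / 15.0
    dec_deg = math.degrees(math.asin(cz / amp))
    return amp, ra_hr, dec_deg
```

The sightlines must not all lie on a single circle on the sky, otherwise the normal matrix is singular and the dipole and monopole terms cannot be separated.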

Uncertainty estimates on dipole locations are derived by transforming the covariance
matrix from our fit in rectilinear coordinates, ($c_x$, $c_y$, $c_z$), to spherical
coordinates (r, φ, θ) using the standard Jacobian matrix. That is, if J is the Jacobian
matrix of the transformation from rectilinear to spherical coordinates, and C is the
covariance matrix calculated from the fit, then $C' = J\,C\,J^{\mathsf{T}}$ gives the
approximate covariance matrix in spherical coordinates. The radial component, r, corresponds
to the amplitude of the dipole, whereas φ and θ can be converted to the RA and dec of the
pole of the dipole. Our errors on RA and dec are thus linearised approximations based on the
covariance matrix at the best-fitting solution, and should be regarded only as approximate.
These error estimates will be inaccurate if they subtend a large fraction of the sky.
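
The linearised propagation described above can be written out explicitly. The sketch below is an illustration only: the function name is hypothetical, and for convenience it maps directly to (r, RA, dec) in radians rather than to a polar angle θ, which is an assumption about convention rather than something stated in the text.

```python
import math

def cartesian_to_spherical_cov(c, cov):
    """Propagate a 3x3 covariance of (cx, cy, cz) to (r, ra, dec), angles in radians.

    Returns ((r, ra, dec), cov_spherical) with cov_spherical = J cov J^T.
    """
    cx, cy, cz = c
    rho2 = cx * cx + cy * cy          # squared distance from the polar axis
    r2 = rho2 + cz * cz
    r, rho = math.sqrt(r2), math.sqrt(rho2)
    if rho == 0.0:
        raise ValueError("RA is undefined when c points at a coordinate pole")

    # Rows are the gradients of r, ra = atan2(cy, cx) and dec = asin(cz / r).
    J = [
        [cx / r, cy / r, cz / r],
        [-cy / rho2, cx / rho2, 0.0],
        [-cx * cz / (r2 * rho), -cy * cz / (r2 * rho), rho / r2],
    ]
    JC = [[sum(J[i][k] * cov[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    out = [[sum(JC[i][k] * J[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
    return (r, math.atan2(cy, cx), math.asin(cz / r)), out
```

The square roots of the diagonal elements of the returned matrix give the approximate 1σ errors on the amplitude, RA and dec; the angular errors scale roughly as the Cartesian error divided by r, which is why they become unreliable for weakly detected dipoles.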

Note that, by virtue of the fact that r ≥ 0 in spherical coordinates, the dipole amplitude, A,
is not Gaussian. Thus, we perform a resampling bootstrap analysis (Press et al., 1992) to
derive an uncertainty for a dipole amplitude. Similarly, one cannot use a t-test to determine
if A is significantly different from zero. Thus, we calculate the statistical significance,
1 − p, of the dipole model over the monopole model by using a bootstrap method where
we randomise Δα/α values over sightlines, and from the observed distribution of χ² over
many iterations determine the probability that a value of χ² as good as or better than that
given by our observed dipole fit would occur by chance. One can also use analytic methods
(Cooke & Lynden-Bell, 2010) if desired. These methods should yield similar answers for
large sample sizes. However, for small sample sizes, the results may differ somewhat
(especially if the statistical uncertainties vary significantly in magnitude between the Δα/α
values). As the dipole+monopole model will always improve the fit over a monopole model,
the statistical test is one-tailed, and so when we state the σ-equivalence of a statistical
significance, this is calculated as |probit(p/2)| in order to accord with conventional usage,
where probit is the inverse normal cumulative distribution function.
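
The randomisation test and the conversion of p to a σ-equivalent can be sketched as follows. This is a schematic illustration, not the code used for the analysis: `chi2_of` is a hypothetical callback that refits the dipole+monopole model to a given ordering of the Δα/α values and returns the resulting χ², and the number of iterations and the seed are arbitrary.

```python
import random
from statistics import NormalDist

def randomisation_p_value(values, chi2_of, n_iter=1000, seed=0):
    """Fraction of shuffled data sets whose fit is as good as or better than the observed one.

    values  : the measured values, in sightline order
    chi2_of : hypothetical callback; refits the model to a list of values
              (assigned to the fixed sightlines) and returns chi-squared
    """
    rng = random.Random(seed)
    observed = chi2_of(values)
    shuffled = list(values)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)          # reassign values to sightlines at random
        if chi2_of(shuffled) <= observed:
            hits += 1
    return hits / n_iter

def sigma_equivalent(p):
    """Conventional sigma-equivalent of a probability p (0 < p < 1): |probit(p/2)|."""
    return abs(NormalDist().inv_cdf(p / 2.0))
```

With this convention p = 0.0455 corresponds to approximately 2σ. Note that a p-value of exactly zero from a finite number of iterations only bounds p from above (p < 1/n_iter) and cannot be passed to `sigma_equivalent`.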

In principle, one can use penalised likelihood methods to determine which model is
preferred; however, these information criteria (e.g. the AICC) are only heuristics, and have
some drawbacks (Liddle, 2007). The bootstrapping approach described above yields a
direct estimation of the preference for a dipole+monopole model over a monopole-only
model, and so we use that method here.

Unless otherwise mentioned, we multiply uncertainty estimates on monopole values and
sky coordinates by $\sqrt{\chi^2_\nu}$ as a first-order correction for over- or
under-dispersion about the fitted model (Press et al., 1992).

We have checked that our optimisation code is performing adequately by rotating the
Δα/α data sets into different coordinate frames. Although clearly this will change the
error estimates on the angular position, the statistical significance tests and the value of
the dipole amplitude should not be affected, and this was found to be the case.

4-4.8 Estimating random errors & the Least Trimmed Squares (LTS) method

4-4.8.1 Error bar inflation & over-dispersion

A common problem with the analysis of observational data is that the observed scatter
about the model is too high to be accounted for by the model. This can either be caused
by an incorrect model or, if the model is a good approximation to the true underlying
process, random and systematic errors. For data with Gaussian statistical errors, this
effect is revealed by $\chi^2_\nu > 1$ for large ν. Indeed, when modelling Δα/α as a function
of time and space, we expect over-dispersion about any simplistic model, as the true
functional form of the variation is unknown. Therefore, over-dispersion about a particular
model will reflect not only unmodelled systematic effects in the observations, but also
some element of model mis-specification. This does not render the modelling useless:
detection of effects in reasonable models at high enough statistical significance is still a
demonstration of an underlying deviation from known physics. However, it is a reminder
that all the models presented here must be considered approximations at best.

One solution to this problem is to use an unweighted model, thereby allowing the dispersion 
of the model to determine the implied model errors. Whilst this is valid in a systematic- 
dominated regime, typically one operates somewhere between being statistically dominated 
and systematic-dominated. In this case, an unweighted model is inappropriate, as it 
ignores legitimate statistical information. The ideal solution is to model the influence 
of the systematic error, however this is not always possible, particularly in Voigt profile 
analysis of quasar absorption lines, where certain systematics can be difficult to quantify 
a priori. 

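Suppose that measurement values $y_i$ arise from true values $Y_i$ on account of observational
statistical scatter. The probability of observing $y_i$ given $Y_i$ is then

$$\Pr(y_i|Y_i) = \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left(\frac{-[y_i - Y_i]^2}{2\sigma_i^2}\right). \tag{4.21}$$

Now suppose some unknown random effect with uniform size $\sigma_{\rm rand}$ causes extra
scatter beyond that caused by statistical errors. If the model prediction is
$f(\mathbf{x})_i$, then the probability that a true value $Y_i$ arises from scatter about the
model is then

$$\Pr(Y_i|f(\mathbf{x})_i) = \frac{1}{\sigma_{\rm rand}\sqrt{2\pi}} \exp\left(\frac{-[Y_i - f(\mathbf{x})_i]^2}{2\sigma_{\rm rand}^2}\right). \tag{4.22}$$

In this case, the probability of measuring $y_i$ given the model is then (Cooke & Lynden-Bell,
2010)

$$\Pr(y_i|f(\mathbf{x})_i) = \int_{-\infty}^{\infty} \Pr(y_i|Y_i)\,\Pr(Y_i|f(\mathbf{x})_i)\,dY_i \tag{4.23}$$

$$= \frac{1}{\sqrt{2\pi(\sigma_i^2 + \sigma_{\rm rand}^2)}} \exp\left(\frac{-[y_i - f(\mathbf{x})_i]^2}{2(\sigma_i^2 + \sigma_{\rm rand}^2)}\right). \tag{4.24}$$

This then yields the log-likelihood, ln(L), as

$$\ln(L) = \ln\prod_i \Pr(y_i|f(\mathbf{x})_i) = -\frac{1}{2}\sum_i\left[\frac{(f(\mathbf{x})_i - y_i)^2}{\sigma_i^2 + \sigma_{\rm rand}^2} + \ln\left(2\pi[\sigma_i^2 + \sigma_{\rm rand}^2]\right)\right]. \tag{4.25}$$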
For large ν, the maximum of the likelihood occurs for $\chi^2_\nu = 1$. Although one can in
principle determine the maximum of equation 4.25 directly, this requires non-linear methods
even for linear models. A more practical option is to slowly add a term $\sigma_{\rm rand}$
in quadrature with the observational uncertainties on Δα/α values, finding the minimum χ² of
the model under consideration at each iteration, until $\chi^2_\nu \approx 1$ about the fitted
model (i.e. $\sigma_{\rm tot}^2 = \sigma_{\rm stat}^2 + \sigma_{\rm rand}^2$). This method has
been used previously to attempt to estimate the size of a random error, or aggregation of
random errors, responsible for any extra scatter in the observed data (Murphy, 2002; Murphy
et al., 2003a, 2004). Note that this assumes that all data points are equally affected by the
same random errors, which is unlikely to be true in practice. Therefore, it is prudent to
attempt to identify subsamples which are affected by different random errors and correct them
independently. It is also worth noting that, because of the propagation of uncertainty for
Gaussian errors, $\sigma_{\rm rand}$ is the aggregation of any series of Gaussian random errors
with zero expectation value ($\sigma_{\rm rand} = \sqrt{\sum_j \sigma_j^2}$), providing
they are uncorrelated.
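
The iterative procedure described above can be illustrated for the simplest possible model, a weighted mean. The sketch below is an assumption-laden example rather than the analysis code: the function names and the fixed step size are hypothetical, and for a dipole model the weighted-mean refit would be replaced by the corresponding least-squares fit with ν = n − 4.

```python
def chi2_nu_weighted_mean(y, sigma, sigma_rand):
    """Refit a weighted mean with sigma_rand added in quadrature; return (mean, chi2_nu)."""
    w = [1.0 / (s * s + sigma_rand * sigma_rand) for s in sigma]
    mean = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    chi2 = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, y))
    return mean, chi2 / (len(y) - 1)   # one fitted parameter

def estimate_sigma_rand(y, sigma, step, max_steps=1000000):
    """Increase sigma_rand in small steps, refitting each time, until chi2_nu <= 1."""
    sigma_rand = 0.0
    mean, chi2_nu = chi2_nu_weighted_mean(y, sigma, sigma_rand)
    steps = 0
    while chi2_nu > 1.0 and steps < max_steps:
        sigma_rand += step
        mean, chi2_nu = chi2_nu_weighted_mean(y, sigma, sigma_rand)
        steps += 1
    return sigma_rand, mean, chi2_nu
```

If the data are already consistent with the model (χ²_ν ≤ 1), the function returns σ_rand = 0, so the statistical errors are never reduced. The step size sets the resolution of the estimate and should be small compared with the typical statistical error.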



4-4.8.2 Other robust methods 



The method described in the previous section works well provided that one truly believes
that the random errors which affect all Δα/α values have the same underlying process.
For quasar absorption line Voigt profile fitting, this is unlikely to be true. We fit profiles to
a wide range of systems of varying species, some with substantial ranges in optical depth.
Additionally, different parts of the spectrum are affected by different issues. In particular,
the red end of the spectrum displays significant sky absorption and emission, the presence
of which cannot be unequivocally excluded in certain cases, particularly where the spectral
region is of relatively low SNR. Additionally, we cannot assume that certain processes are
described by a Gaussian. Some events are binary (e.g. sky emission is either present or it is
not, although clearly the magnitude of the impact could vary substantially). Furthermore,
certain events may occur with low probability most of the time but have significant impact
(e.g. uncleaned cosmic rays). All of these considerations lead to the possibility of outliers
in the sample (that is, values of Δα/α which do not match the trend shown by other
absorbers).

Outliers cause two problems. Firstly, they bias parameter estimates away from underlying
values. As χ² minimisation weights points by the square of the weighted normalised
residuals ($r_i$ = [Δα/α − model prediction]/$\sigma_{\rm tot}$), even a few large-residual
points can cause substantial bias in parameter estimates. Secondly, any estimate of the
average random error from growing the error bars in quadrature with some $\sigma_{\rm rand}$
will not be a good estimate of the average random error affecting the good points; it will
over-estimate the random error affecting most points, whilst under-estimating the systematic
affecting the outliers. A traditional solution to the second problem has been to manually
remove outliers from the sample, typically by discarding points with $|r_i| > 3$. However,
because the outliers are included in the fit, they tend to bias the fit towards them. This
tends to mask the presence of other outliers, and may lead to the rejection of good data.
Additionally, because one has to estimate $\sigma_{\rm rand}$ before calculating the
residuals, the overly large estimate of $\sigma_{\rm rand}$ will tend to mask outliers. This
can lead to both false positives and false negatives.

Although there is no perfect solution to this problem, these considerations have led to the
development of robust statistical methods (see Rousseeuw & Leroy, 1987, for a review of
many of the basic approaches). Although these methods mildly underperform standard
least squares methods in the presence of no contamination, for data sets with contamination
of even a few percent robust statistical methods can lead to dramatic outperformance
(Rousseeuw & Leroy, 1987).

A common method is to use a so-called M-estimator, which minimises a maximum-likelihood
type estimate of the residuals, $\sum_i \rho(r_i)$. For standard least squares,
$\rho(x) = x^2$. Choosing $\rho(x) = |x|$ leads to the L1-norm method of minimising the mean
absolute deviation of residuals (Rousseeuw & Leroy, 1987; Press et al., 1992). This
corresponds to a probability distribution where the residuals are distributed as a double
exponential, namely

$$\Pr(r_i) \propto \exp(-|r_i|) \tag{4.26}$$

(Press et al., 1992). Other functions do not correspond to traditional probability
distributions, but instead are heuristic functions derived to have robustness against outliers
whilst still maintaining good statistical efficiency. A widely used choice is Tukey's biweight
(Rousseeuw, 1984),

$$\rho(x) = \begin{cases} \dfrac{x^2}{2} - \dfrac{x^4}{2c^2} + \dfrac{x^6}{6c^4} & \text{for } |x| \le c, \\[1ex] \dfrac{c^2}{6} & \text{for } |x| > c. \end{cases} \tag{4.27}$$

As $x \to 0$, $\rho(x) \sim x^2/2$ and so this approximates standard least squares fitting. On
the other hand, for $|x| \to c$, $\rho(x) \to c^2/6$. Thus the effect of outliers is bounded.
The effect of outliers on parameter estimates relates to the function $\psi(x) = \rho'(x)$.
For Tukey's biweight, $\psi(x) = 0$ for $|x| > c$ and therefore these outliers have no
influence on parameter determination, which is desirable. Unfortunately, application of
M-estimators requires that the expected scatter of data about the model be known. If random
errors are significant, then naive application of an M-estimator to statistically weighted
data will simply discard many points which are not necessarily outliers when considered in the
context of the observed scatter of the points about the model.
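
The bounded influence of Tukey's biweight is easy to see numerically. The sketch below assumes the standard normalisation $\rho(x) = x^2/2 - x^4/(2c^2) + x^6/(6c^4)$ for $|x| \le c$ and $c^2/6$ otherwise; the default tuning constant c = 4.685 is a conventional choice for M-estimation and is not taken from the text.

```python
def tukey_rho(x, c=4.685):
    """Tukey's biweight loss; quadratic near zero, constant beyond |x| = c."""
    if abs(x) > c:
        return c * c / 6.0
    x2 = x * x
    return x2 / 2.0 - x2 * x2 / (2.0 * c * c) + x2 ** 3 / (6.0 * c ** 4)

def tukey_psi(x, c=4.685):
    """Derivative of tukey_rho: the influence of a residual on the fit."""
    if abs(x) > c:
        return 0.0   # points beyond c exert no pull on the parameters
    return x * (1.0 - (x / c) ** 2) ** 2
```

For small residuals `tukey_psi(x)` is close to x, as in least squares, while a residual of any size beyond c contributes the same constant loss and zero influence.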

One solution to this is to use an S-estimator (Rousseeuw & Yohai, 1984), which attempts
to obtain a robust estimate of scale. To do this, one defines a robust estimate of scale, s,
as the solution of

$$\frac{1}{n}\sum_{i=1}^{n} \rho\left(\frac{r_i}{s}\right) = K \tag{4.28}$$

for some K. K is typically set to be equal to the expected value of $\rho(x)$ under a Gaussian
distribution, that is

$$K = \int_{-\infty}^{\infty} \rho(x)\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx. \tag{4.29}$$

For χ² minimisation, $\rho(x) = x^2$ and K = 1.

There are two problems with this in a scientific context. Firstly, s enters reciprocally in
the argument of ρ in equation 4.28, implying that s simply scales all errors by a common
factor, which is undesirable. Certainly, the statistical errors impose an absolute lower
bound on the precision available from each data point. The solution to this is to solve the
implicit equation

$$\frac{1}{n}\sum_{i=1}^{n} \rho\left(\frac{y_i - f(\mathbf{x})_i}{\sqrt{\sigma_i^2 + \sigma^2}}\right) = K \tag{4.30}$$

for σ and $\mathbf{x}$. Unfortunately this improvement leads to the second problem. One can
rapidly determine through experimentation that even a few outliers of arbitrarily large
magnitude can substantially influence the estimate of σ, because for $|r_i| > c$, $\rho(r_i)$
may not change significantly upon increasing σ by reasonable amounts. We thus found that
the application of S-estimators was not appropriate for our purposes.



4-4.8.3 Bayesian methodology 

Another approach is to use Bayesian methods, which can formally account for an uncertainty
in the error estimates, $\sigma_i$. Suppose that the statistical error bar, $\sigma_i$, is
taken as a lower bound on the true error bar. One way of incorporating this approach is to
take a prior PDF for the true error bar, $\sigma_{i,t}$, as

$$\Pr(\sigma_{i,t}|\sigma_i, M) = \frac{\sigma_i}{\sigma_{i,t}^2} \quad \text{for } \sigma_{i,t} \ge \sigma_i, \tag{4.31}$$

where M is the model used (Sivia & Skilling, 2006). A more correct approach is to assign
a Jeffreys' prior, as

$$\Pr(\sigma_{i,t}|\sigma_i, c, M) = \frac{1}{\ln(c/\sigma_i)\,\sigma_{i,t}}; \tag{4.32}$$

however, this requires specification of a finite upper bound, c, to make the prior
normalisable. The choice of equation 4.31 should not substantially alter the conclusions of
this analysis (Sivia & Skilling, 2006). Suppose we consider a single datum; then the
marginal likelihood for the data, D, with the unknown $\sigma_{i,t}$ integrated out is

$$\Pr(D|F, \sigma_i, M) = \int_{\sigma_i}^{\infty} \Pr(D|F, \sigma_{i,t}, M)\,\Pr(\sigma_{i,t}|\sigma_i, M)\,d\sigma_{i,t},$$

where F is the function being estimated. If we assume a Gaussian PDF for
$\Pr(D|F, \sigma_{i,t}, M)$ then we obtain

$$\Pr(D|F, \sigma_i, M) = \frac{1}{\sigma_i\sqrt{2\pi}} \left[\frac{1 - e^{-r_i^2/2}}{r_i^2}\right],$$

where $r_i$ is the residual about the model, (data − model)/$\sigma_i$. Extending this
analysis to a set of data of size N, and assigning uniform prior PDFs to the parameters, the
log posterior probability is

$$L = \ln[\Pr(X|D, M)] = \text{constant} + \sum_{k=1}^{N} \ln\left[\frac{1 - e^{-r_k^2/2}}{r_k^2}\right] \tag{4.33}$$

(Sivia & Skilling, 2006). Maximising L (or minimising −L) thus yields a robust estimate
of the parameters under the assumptions described above. For brevity, we refer to this
methodology as skeptical Bayesian regression. In the context of a linear fit, we call this
skeptical Bayesian linear regression (SBLR). Unfortunately, the rather broad assumption
about the validity of the {$\sigma_i$} values leads to a loss of statistical precision for
the resultant uncertainties on model parameters in the event that the residuals are Gaussian.
On the other hand, in the event that the residuals are not Gaussian, then the standard least
squares assumption that the residuals are Gaussian will mean that confidence limits on
model parameters are too small. In this case, the approach described here naturally gives
a highly robust method of determining parameters whilst making use of all the data. We
return to this method later.
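
To make the behaviour of the log posterior in equation 4.33 concrete, the sketch below evaluates it for a constant (monopole-only) model and locates its maximum with a simple grid search. The function names and the grid search are assumptions made for illustration; a real application would use a proper optimiser and the full model.

```python
import math

def log_term(r):
    """ln[(1 - exp(-r^2/2)) / r^2], with a series expansion near r = 0."""
    r2 = r * r
    if r2 < 1e-8:
        # (1 - e^{-r2/2}) / r2 = 1/2 - r2/8 + ..., so the log is ln(1/2) - r2/4 + ...
        return math.log(0.5) - r2 / 4.0
    return math.log(-math.expm1(-0.5 * r2) / r2)

def log_posterior(model, y, sigma):
    """Log posterior (up to a constant) for model predictions, data and error bars."""
    return sum(log_term((yi - mi) / si) for yi, mi, si in zip(y, model, sigma))

def fit_constant(y, sigma, n_grid=2001):
    """Grid search for the constant value that maximises the log posterior."""
    lo, hi = min(y), max(y)
    best_mu, best_lp = lo, -math.inf
    for i in range(n_grid):
        mu = lo + (hi - lo) * i / (n_grid - 1)
        lp = log_posterior([mu] * len(y), y, sigma)
        if lp > best_lp:
            best_mu, best_lp = mu, lp
    return best_mu
```

For large residuals each term behaves as −2 ln|r| rather than −r²/2, so a single discrepant point costs only a logarithmic penalty and cannot drag the fit away from the bulk of the data. The price is that the posterior can be multimodal, which is why a global search (here a grid) is safer than a purely local optimiser.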



4-4.8.4 The LTS method 



Ideally, what one would like to do is identify outliers and remove them from the sample
under consideration. This not only means that they cannot bias parameter estimates, but that
the estimate of $\sigma_{\rm rand}$ is likely to be much more reasonable. For this, we have
found that the Least Trimmed Squares (LTS) method (Rousseeuw, 1984) works well. Instead of
fitting all n data points, the LTS method traditionally only fits k = (n + p + 1)/2 points
(where p is the number of free parameters) using standard least squares, and then searches for
the combination of k data points and fitted model that yields the lowest sum of squared
residuals. In our case, we modify this to include statistical weightings on the data points.
We thus wish to find the combination of k data points and model which minimises χ².
Essentially, the method only fits the inner k/n fraction of the distribution of the residuals.
Where a few outliers exist, they will be ignored by this method provided that they are in
the excluded fraction.

Calculation of the LTS is computationally intensive because the target function is highly
non-linear on account of the inclusion/exclusion of data, as well as the need to sort the
residuals. To directly explore all the $\binom{n}{k}$ possible combinations is unfeasible for
the datasets we consider. Original methods attempted to sample from this space using a forward
search algorithm (Atkinson, 1994); however, a newer algorithm, Fast-LTS (Rousseeuw
& Driessen, 2006), demonstrates good results for hundreds to thousands of data points.
We implement the Fast-LTS algorithm.
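
The core idea behind Fast-LTS is the concentration step: fit the current subset, then keep the k points with the smallest squared residuals about that fit, and repeat from several random starts. The sketch below applies this idea to a weighted-mean model only. It is a simplified illustration and not the full Fast-LTS algorithm; the function names, the number of random starts and the size of the initial subsets are all assumptions.

```python
import random

def weighted_mean(y, sigma, idx):
    """Weighted mean of the points whose indices are in idx."""
    w = [1.0 / sigma[i] ** 2 for i in idx]
    return sum(wi * y[i] for wi, i in zip(w, idx)) / sum(w)

def c_steps(y, sigma, idx, k, max_steps=50):
    """Concentration steps: refit, then keep the k points with the smallest
    squared normalised residuals, until the subset stops changing."""
    for _ in range(max_steps):
        mu = weighted_mean(y, sigma, idx)
        order = sorted(range(len(y)), key=lambda i: ((y[i] - mu) / sigma[i]) ** 2)
        new_idx = sorted(order[:k])
        if new_idx == idx:
            break
        idx = new_idx
    mu = weighted_mean(y, sigma, idx)
    chi2 = sum(((y[i] - mu) / sigma[i]) ** 2 for i in idx)
    return mu, chi2, idx

def lts_weighted_mean(y, sigma, frac=0.85, n_starts=50, seed=0):
    """Trimmed weighted mean: best result of c_steps over several random starts."""
    n = len(y)
    k = max(2, int(round(frac * n)))
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        start = sorted(rng.sample(range(n), 2))   # small random initial subset
        result = c_steps(y, sigma, start, k)
        if best is None or result[1] < best[1]:
            best = result
    return best   # (mu, chi2 over the k retained points, retained indices)
```

Each concentration step can only lower (or leave unchanged) the trimmed χ², so every start converges to a local minimum; using many starts makes it likely that at least one of them begins from an uncontaminated subset.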

Although in the limit n → ∞ the use of k = (n + p + 1)/2 will produce a very robust
fit, for small n (e.g. n < 20) we are wary of finding combinations of (n + p + 1)/2 points
by chance that do not reflect the true trend. However, we still wish to obtain robustness
against outliers. As such, we operate with k = 0.85n, which provides robustness against up
to 15% of the data being outliers.

To allow for the inclusion of (Jrand) we propose a variant of the LTS method which proceeds 
as follows. First, we define a robust scatter measure as 

2/7 \ 1 (/(x)j — yj)^ /A '}A\ 

X'-(^) = ^-L ^2 ,^2 ' (4-34) 

where the sum is taken over only the smallest k residuals of the fit and σ_j is the
uncertainty on point j, including σ_rand added in quadrature. We then slowly increase
σ_rand from zero until χ²_ν(k) is what we would expect for a Gaussian distribution with large
ν, refitting and recalculating χ²_ν(k) after each increment in σ_rand. For k = n this yields
⟨χ²_ν(k)⟩ = 1, but for k < n we obtain

$$\langle \chi^2_\nu(k) \rangle = \frac{1}{\sqrt{2\pi}} \int_{-a}^{a} x^2 e^{-x^2/2}\, dx, \qquad (4.35)$$

where a = probit[(1 + f)/2] and f = k/n. Here probit is the inverse normal cumulative
distribution function. We take the value of σ_rand derived in this way as our estimate of
the additional random error for the data given the model. In this way, if the data are
contaminated by a few outliers these will not impact the estimate of the random error
which affects most points.
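The procedure just described can be sketched as follows for a constant (weighted mean) model. Several choices here are assumptions made for illustration rather than details taken from the text: the trimmed sum is normalised by n (adequate for large n), the Gaussian expectation is taken with a unit-normal normalisation so that it equals 1 when k = n, the LTS fit uses concentration steps started from the median, and the step size and function names are hypothetical.

```python
import math
from statistics import NormalDist, median

def expected_trimmed_chi2(frac):
    """Expected (1/n) * sum of the smallest frac*n squared unit-Gaussian residuals.

    Closed form of (2*pi)**-0.5 * integral_{-a}^{a} x^2 exp(-x^2/2) dx
    with a = probit((1 + frac) / 2); equals 1 when frac = 1.
    """
    if frac >= 1.0:
        return 1.0
    nd = NormalDist()
    a = nd.inv_cdf((1.0 + frac) / 2.0)
    return frac - 2.0 * a * nd.pdf(a)

def lts_mean(y, err, k, max_iter=100):
    """Concentration steps for a constant model, started from the median."""
    mu = median(y)
    for _ in range(max_iter):
        order = sorted(range(len(y)), key=lambda i: ((y[i] - mu) / err[i]) ** 2)
        idx = sorted(order[:k])
        w = [1.0 / err[i] ** 2 for i in idx]
        mu_new = sum(wi * y[i] for wi, i in zip(w, idx)) / sum(w)
        if mu_new == mu:
            break
        mu = mu_new
    return mu

def trimmed_stat(y, err, mu, k):
    """Sum of the k smallest squared normalised residuals, divided by n."""
    r2 = sorted(((yi - mu) / ei) ** 2 for yi, ei in zip(y, err))
    return sum(r2[:k]) / len(y)

def estimate_sigma_rand(y, sig, frac=0.85, step=None, max_steps=100000):
    """Increase sigma_rand from zero until the trimmed statistic reaches its Gaussian expectation."""
    n = len(y)
    k = math.ceil(frac * n)
    target = expected_trimmed_chi2(k / n)
    step = 0.01 * min(sig) if step is None else step
    s_rand = 0.0
    for _ in range(max_steps):
        err = [math.hypot(s, s_rand) for s in sig]   # add sigma_rand in quadrature
        mu = lts_mean(y, err, k)                      # refit after each increment
        if trimmed_stat(y, err, mu, k) <= target:
            return s_rand, mu
        s_rand += step
    raise RuntimeError("sigma_rand did not converge")
```

Because only the inner k residuals enter the statistic, a handful of gross outliers leaves the estimate of σ_rand almost unchanged, whereas demanding χ²_ν = 1 over all points would inflate it.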

After applying the LTS method to estimate the random error term, we then discard all 
points with normalised residuals |r_j| > 3 about the LTS fit, but only if we are applying the method to a full
sample (i.e. the whole VLT or Keck sample, or a combination of the two). This is because 
in small-n fits one does not have much data, and so it is not clear whether outliers would 
become inliers with more data. 

If we remove outliers, we then reapply the LTS method to check that no more outliers
are unmasked, and to re-estimate σ_rand. The LTS fit is statistically inefficient because
it ignores some good data (15 percent for k = 0.85n if all the remaining points are
inliers). Therefore, after we discard high-residual points, we apply a normal weighted least
squares fit to the remaining data to estimate the parameters and achieve the best possible
confidence limits on our modelled parameters.
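The final two steps (discarding points with |r_j| > 3 about the LTS fit, then an ordinary weighted fit to what remains) can be sketched for a weighted mean model as below. The LTS fit value and σ_rand are taken as inputs, the re-application of the LTS method after clipping is not shown, and the function name and example numbers are assumptions made for illustration.

```python
import math

def clip_and_refit(y, sig, sigma_rand, mu_lts, threshold=3.0):
    """Drop points with |r_j| > threshold about the LTS fit, then take a weighted mean.

    Returns (mean, error on the mean, reduced chi-squared, indices of retained points).
    """
    err = [math.hypot(s, sigma_rand) for s in sig]          # total error bars
    keep = [i for i in range(len(y)) if abs(y[i] - mu_lts) / err[i] <= threshold]
    w = [1.0 / err[i] ** 2 for i in keep]
    mean = sum(wi * y[i] for wi, i in zip(w, keep)) / sum(w)
    mean_err = 1.0 / math.sqrt(sum(w))
    chi2 = sum(((y[i] - mean) / err[i]) ** 2 for i in keep)
    dof = len(keep) - 1                                       # one fitted parameter
    chi2_nu = chi2 / dof if dof > 0 else float("nan")
    return mean, mean_err, chi2_nu, keep

# Hypothetical values: four consistent points and one outlier.
mean, mean_err, chi2_nu, kept = clip_and_refit(
    y=[0.1, -0.2, 0.3, 0.0, 9.0], sig=[0.5] * 5, sigma_rand=1.0, mu_lts=0.05)
print(mean, mean_err, chi2_nu, kept)
```

The weighted fit uses every retained point, which is what recovers the statistical efficiency that the trimmed fit gives up.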

The benefits of the LTS method can be summarised as follows. 

1. Robust estimate of σ_rand. If we calculate σ_rand by increasing it until χ²_ν = 1, then
even a single, arbitrarily large outlier can increase σ_rand without bound. This is
much less likely with the LTS method. A more appropriate estimate of σ_rand means
that false negatives are less likely.

2. Robust detection of outliers. In a standard χ² minimisation fit, residuals with larger
magnitude |r_j| are weighted as r_j², which distorts the fit towards them. This tends
to mask outliers. By distorting the fit, one might incorrectly decide that some good
points are in fact outliers. Similarly, the existence of one outlier tends to conceal the
existence of additional outliers (a masking effect).

3. Objectivity. Manual outlier rejection is often characterised as subjective. The rule
given here provides an objective method of classifying data points as outliers,
thereby removing this objection.

4. More robust parameter estimates. Even a few outliers can substantially distort the 
fit. This biases parameter estimates away from their underlying values. We are 
interested in the underlying values, not the values given by a blind least squares fit. 
The rate of false positives should also be decreased, as false positives can be caused 
by outliers. 

4-5 Many-multiplet VLT results

We give our many-multiplet VLT results in table 4.2. The values of Δα/α for the VLT
sample are shown in figure 4.9. The distribution of observed wavelengths at which certain
representative transitions occur can be found in figure 4.7. The frequency with which all
utilised transitions are fitted is given in table 4.3. In figure 4.8 we show the relationship
between the q coefficients and observed wavelength for all the transitions fitted in the VLT
sample. A summary of the parameters for various models for Δα/α fitted to the VLT and
Keck data may be found in table 4.4. Plots of the fit to each absorber may be found in
Appendix E on page 307.






Table 4.2: Results for Δα/α derived from MM absorbers. Errors given are purely statistical (see definition in section 4-4.3). The emission
redshift of the quasar and absorption redshift are given by z_em and z_abs respectively. z_abs is given as the redshift of the strongest component in
the fit. Three values of Δα/α are given: the turbulent fit, the thermal fit, and the value derived from the method-of-moments estimator from
section 4-4.6. The key for the transition labels, in column 4, is given in table 4.3. χ²_ν and ν, the number of degrees of freedom, are given for both
the turbulent and thermal fits. The last column gives the value of Δα/α derived using our method-of-moments estimator. The absorber marked
with an asterisk (*) has been identified as an outlier and removed from the analysis.





Quasar name | z_em | z_abs | Transitions | Δα/α_turb (10⁻⁵) | ν_turb | χ²_ν,turb | Δα/α_therm (10⁻⁵) | ν_therm | χ²_ν,therm | Δα/α_MoM (10⁻⁵)


T000344— 232355 


2 


28 





4521 




—0 498 ± 


758 


533 


0.7653 


—0 459 ± 787 


533 





7466 


—0 459 ± 787 


J000344— 232355 


2 


28 





9491 


^'^^h^ no lA if; IQ 


— 1.390 ± 2 


700 


530 


0.7766 


—4 960 ± 2 650 


531 





7916 


—1 534 ± 2 788 


Tnnn344— 232355 


2 


28 


\ 






— 1.630 ± 1 


040 


380 


0.7061 


—0 389 + 989 


380 





fi847 


—0 410 ± 1 003 


Tnnn44S— 41 5728* 


2 


76 


\ 


5419 


hn i A Tc li; To 

^1 ^2J4JbJ6j7J8 


—5 270 + 


906 


372 


0.7682 


— 1 920 + 741 


372 


\ 


0146 


—5 270 ± 906 


T000448— 415728 


2 


76 


1 


9886 


1 A ia 'i'y To /7i fin 


—0.952 ± 1 


510 


254 


0.8991 


934 ± 1 830 


254 





8943 


0.266 ± 1.945 


J000448— 415728 


2 


76 


2 


1679 


hi 1a 1a Ik Ic^Ca Pa 


2.320 ± 


942 


309 


0.7517 


1 370 ± 938 


309 





7229 


1.381 ± 0.944 


J001210— 012207 


2 


00 


1 
± 


zuou 


Ha hn 1a 'J(^'J*7'Jo^i 


0.723 ± 1 


290 


505 


0.9464 


772 ± 1 190 


505 


n 

u 


SQ'iQ 


772 ± 1 190 




2 







6351 


bi ^2 J4 is 


— n 71 Q + 'H 


iJOU 




1.1152 


_n AQQ _|_ Q AAA 




1 


1318 




J001602-001225 


2 


09 





6363 


&l62j4j6j8 


-2.130 ±4 


310 


184 


0.8184 


-1.460 ±3.830 


184 





7997 


-1.561 ± 3.914 


J001602-001225 


2 


09 





8575 




1.140 ± 1 


850 


191 


0.8713 


1.490 ± 1.760 


191 





8773 


1.266 ± 1.826 


J001602-001225 


2 


09 


1 


1468 


blb2j6js 


-2.490 ± 3 


720 


159 


1.1667 


-1.580 ± 2.920 


159 


1 


0859 


-1.581 ± 2.922 


J001602-001225 


2 


09 


2 


0292 


a2jii4j6i7j8dirf2ei 


-0.909 ±0 


934 


425 


0.9288 


-2.780 ±0.877 


425 


1 


0167 


-0.909 ±0.934 


J004131-493611 


3 


24 


2 


1095 


bib2j7j8Cidid2 


-1.090 ±2 


950 


430 


0.6242 


0.980 ±2.590 


430 





6200 


0.386 ±2.856 


J004131-493611 


3 


24 


2 


2485 


jij2j3j6j7js,cidid2e2h\h2h'ihl2kiiii2 


-1.230 ±0 


672 


756 


0.7072 


-0.972 ± 0.669 


757 





7281 


-1.230 ±0.672 


J005758-264314 


3 


65 


1 


2679 


a2bib2j(,j7js 


1.660 ±2 


120 


209 


0.8127 


0.416 ± 1.430 


209 





8139 


1.076 ± 1.931 


J005758-264314 


3 


65 


1 


5336 


b2j3jA35j&37hlili2 


-0.456 ±0 


903 


306 


0.9050 


-0.151 ±0.769 


306 





9980 


-0.456 ±0.903 


J010311+131617 


2 


68 


1 


7975 


02^1^4^5^662 


0.422 ±0 


537 


371 


1.1633 


0.964 ±0.560 


372 


1 


1841 


0.443 ±0.548 


J010311+131617 


2 


68 


2 


3092 


j\j2Ciexe2hih2h3hl2kik2 


-0.082 ±0 


563 


672 


1.0917 


-1.690 ±0.521 


673 


1 


2312 


-0.082 ± 0.563 


J010821+062327 


1 


96 


1 


9328 




2.200 ± 2 


460 


262 


1.2053 


0.782 ± 1.230 


262 


1 


2392 


2.184 ± 2.454 


J011143-350300 


2 


41 


1 


1827 


a2bib2jije.j7js 


0.150 ±0 


953 


438 


0.7192 


-0.038 ±0.870 


438 





7336 


0.142 ±0.950 


J011143-350300 


2 


41 


1 


3499 


blb2j4j5j(ij7j8 


0.084 ± 


378 


1037 


0.8889 


1.160 ±0.391 


1043 


1 


2275 


0.084 ±0.378 


J012417-374423 


2 


20 





8221 


a26l62j4j6j7j8 


0.702 ± 1 


050 


214 


0.7919 


1.080 ±0.826 


214 


1 


2128 


0.702 ± 1.050 


J012417-374423 


2 


20 





8593 


a2blb2j5j6j7js 


0.768 ± 1 


870 


261 


0.9144 


-2.660 ± 1.850 


261 





9169 


-0.677 ±2.516 



Continued on next page. 



Quasar name | z_em | z_abs | Transitions | Δα/α_turb (10⁻⁵) | ν_turb | χ²_ν,turb


J012417- 


-374423 


2 


20 


1 


2433 




1.140 ± 1 


050 


364 





8248 


J012417- 


-374423 


2 


20 


1 


9102 


jAjiiC\d\d2e.i 


-4.730 ± 3 


280 


248 





6603 


J013105- 


-213446 


1 


90 


1 


8566 


73^4 jsjedi d2 62 /ii fci fca 


0.304 ± 1 


430 


989 





8841 


J014333- 


-391700 


1 


81 





3400 


h\ 62 jV J8 


-7.010 ±6 


020 


208 





8698 


J014333- 


-391700 


1 


81 


1 


7101 


61 j4 j6 Jsci di d2ei 


-2.200 ±2 


670 


869 





6519 


J015733- 


-004824 


1 


55 





7693 


&l62i6 


2.350 ±4 


350 


89 


1 


0343 


J024008- 


-230915 


2 


22 


1 


1846 


026162^6^8 


-1.680 ±3 


620 


300 





7229 


J024008- 


-230915 


2 


22 


1 


6359 


a2 61 h2 jijAjjjsdi (^261 


1.000 ± 1 


110 


388 





7883 


J024008- 


-230915 


2 


22 


1 


6373 


026162^1^4^7^8 


-0.187 ±1 


020 


475 


1 


1982 


J024008- 


-230915 


2 


22 


1 


6574 


6i62j4j6Ciei 


-0.137 ± 1 


010 


504 





8087 


J033106- 


-382404 


2 


42 





7627 


0261 62 J6 J6 J7j8 


1.250 ±0 


835 


395 





8804 


J033106- 


-382404 


2 


42 





9709 


6162J8 


-4.970 ± 4 


460 


53 





5507 


J033106- 


-382404 


2 


42 


1 


4380 


6162J4J8 


-4.120 ±2 


530 


384 





7541 


J033108- 


-252443 


2 


69 





9925 


6l62j4i8 


-0.210 ±1 


480 


270 





7501 


J033108- 


-252443 


2 


69 


2 


4547 


JlCl 


-1.100 ±6 


490 


203 





9689 


J033244- 


445557 


2 


60 


2 


4112 


7i74 76 77Cieie2 


-1.000 ±0 


793 


340 





9885 


J033244- 


445557 


2 


60 


2 


6563 


?i 74 75 76Ciei 


1.080 ± 1 


690 


308 





9847 


J040718- 


-441013 


3 


00 


2 


4126 


j5j6jsCldld2 


2.420 ± 2 


220 


250 





8800 


J040718- 


-441013 


3 


00 


2 


5499 


jijij6Cieie2kik2 


0.895 ± 


353 


931 


1 


0024 


J040718- 


-441013 


3 


00 


2 


5948 


iiizjs ieci <iid2 6162/11/12 /i3i2A;ife2 


0.574 ± 


345 


865 





8559 


J040718- 


-441013 


3 


00 


2 


6214 


iicici 


4.350 ± 2 


920 


325 





8567 


J042707- 


-130253 


2 


16 


1 


4080 


blb2jlj4j5j6j7jsCl 


-2.550 ± 1 


110 


335 


1 


0038 


J042707- 


-130253 


2 


16 


1 


5632 




-2.640 ± 2 


520 


114 





7238 


J042707- 


-130253 


2 


16 


2 


0351 


jiC\e2 


8.060 ± 3 


830 


324 





8410 


J043037- 


-485523 


1 


94 


1 


3556 


02^1^2.73.74.7576^7^86161 62/11 /la feifc2fc3iii2i3 


-0.405 ± 


232 


1039 





9764 


J044017- 


-433308 


2 


86 


1 


4335 


61 69 74 77 78 


0.139 ± 2 


500 


472 


1 


1308 


J044017- 


-433308 


2 


86 


2 


0482 


jlj2 j4Cl 62/11/12 /l3'2fclA;2fc3i2i3 


1.400 ± 


864 


1595 


1 


3448 


J051707- 


-441055 


1 


71 





2223 


0261 62 jg 


1.380 ±3 


850 


359 





6694 


J051707- 


-441055 


1 


71 





4291 


a26lj4i6i7i8 


-2.740 ± 1 


440 


273 





5915 


J053007- 


-250329 


2 


81 


2 


1412 


61 62 jl J2 JB 61 /ll /Jl /C2 A;3 


0.676 ± 


359 


949 





8670 






Δα/α_therm (10⁻⁵) | ν_therm | χ²_ν,therm | Δα/α_MoM (10⁻⁵)



2 




1 


120 


ouo 





8974 

OZ ( 


\ 


8*^8 -1- 

000 ZIZ 


\ 


221 


Q 




9 


OOU 


z^o 


n 

u 


Uu ( u 


Q 
— 


879 + 
1 z ^ 






111 


n 


91 9 + 


1 

J. 




QQO 


n 


000^ 


n 

u 


zou ^ 


1 

J. 


44^ 


u 




Q 



97n 

Z i u 


90S 
zuo 





OU JZ 


— fi 
u 


748 + 


Q 



914 


1 

1 




9 


oiu 


8RQ 
ouy 


n 

u 


RzlR7 

U4;U / 


1 

1 


4R^ + 

4;UO ZIZ 


9 
z 


00 ^ 


Q 




/I 


1 /to 


8Q 
oy 


1 

± 


O/l'l'^ 

U^uO 


9 
z 


R47 + 

U^ i ^ 


4 


988 
zoo 


1 

J. 


47n + 



z 


^ou 


^00 
ouu 


n 


71 


— J. 


^1 ^ + 



z 


7^4 


u 


uuo ^ 


n 

u 




000 


n 

u 


8989 
ozoz 


1 

J. 


000 + 
uuu ^ 


1 

X 


110 

xxu 


4 




]^ 


1 7n 


47^ 


\ 


9R0^ 
zuuo 


— 
u 


1 87 + 

± ( ZIZ 


\ 


090 
uzu 


9 




1 


070 




n 
u 


Ql "lO 
yiuu 


n 
— u 


1 '^7 + 
10 ( HZ 


1 


01 
uxu 


n 

u 


994 + 


n 


Q1 


oyu 


n 

u 


878f; 
1 ou 


u 


440 + 

^^U HZ 


n 

u 


Q88 

yoo 


4 







Q90 
yzu 







'i48'H 

04:00 


—4 


48"! + 

4:00 ZIZ 


4 


216 


A 


OOU ZIZ 


9 
z 


^iSO 

OOU 


000 


n 

u 


/ ozi 


A 


OZO ZIZ 


9 
z 


^171 
/ 1 


n 


tJ±U ZIZ 




9'^0 
zou 


970 

Z ( u 





70Q1 
( uyj- 



u 


±0 ZIZ 




232 









1 70 


90^ 
zuo 


n 


yoo ( 


_9 
— z 


1 99 + 
± zz ^ 





^yu 


n 

u 


zyo ZIZ 


n 

u 


809 
ouz 


0^1 


1 

1 


070 
u t uo 


1 

— 1 


000 + 

uuu ZIZ 


n 

u 


/ yo 


n 


04;U ZIZ 


\ 


^^0 


■^08 


]^ 


01 7Q 
u± ( y 


\ 


07Q + 

u ( y ZIZ 


]^ 


R8Q 
uoy 





970 + 

Z / U ZIZ 





7R0 


9^0 
zuu 


1 

± 


0409 

U4UZ 


9 
z 


490 + 

^ZU ZIZ 


9 
z 


990 
zzu 


2 


9Qn + 


n 


00 X 


Q^4 




Q49Q 
yizy 



u 


8Q^ + 
oyo ^ 



u 


000 


n 
u 


UUU ZIZ 


n 
u 


OUl 


8RQ 
ouy 


n 

u 


QQR'^ 
yyuo 


n 

u 


"^74 + 
^ 4 ZIZ 


n 
u 


040 


4 




9 
z 




ozo 


n 


oouo 




9fi4 + 
zu^ in 


9 
z 


744 


2 




]^ 


120 


00 J 


]^ 


09fi1 
uzux 


_2 


^^1 + 


]^ 


110 

xxu 


Q 





z 


4^0 


114 

J. j.^ 




uyo J. 


— z 


yu t ^ 



z 


44Q 
^^y 


K 



QQD + 

yyu ZIZ 


Q 


R90 
uzu 


OZ4; 


n 

u 


8899 
oozz 


8 



0'i7 + 

uo / ZIZ 


Q 



890 

OOU 





428 ± 





237 


1041 


1 


0921 


— 


405 ± 





232 


1 


910 ± 


2 


130 


476 


1 


1940 





139 ± 


2 


500 


2 


510 ± 





773 


1595 


1 


3806 


1 


400 ± 





864 


1 


140 zt 


3 


540 


359 





6696 


1 


262 zt 


3 


703 


3 


480 ± 


1 


470 


273 





5898 


-3 


153 ± 


1 


502 





865 ± 





349 


949 





9194 





676 ± 





359 



Quasar name | z_em | z_abs | Transitions | Δα/α_turb (10⁻⁵) | ν_turb | χ²_ν,turb | Δα/α_therm (10⁻⁵) | ν_therm | χ²_ν,therm | Δα/α_MoM (10⁻⁵)


JUOOZ4o— ODO lZ{ 


Z 


OZ 


1 


2252 


(1261 &2 j4 js je jVci ei 


1.850 ± 1 


010 


OQ1 

o8i 


A 
U 


'71 OA 
/ izU 


A OCA _l_ A 

0.269 ± 


895 


OQ1 
08 1 


A CI /I '7 
U.D14/ 


A OCA _l_ A 

0.269 ± 


895 


J 055246— .50.5727 


2 


32 


1 


7475 


ffl2jlj4j5j6j7j8ei 


—0.795 ± 1 


080 


305 





7899 


—2.180 ± 1 


050 


305 


A OA /I 

0.8042 


A A OC _l_ 1 

—0.936 ± 1 


155 


JUooz4d— ODO (2( 


o 
z 


3z 


i 


9o60 


jij4,jecidid2ei 


A 1 A/1 _1_ 1 

—0.104 ± 1 


K AA 
OOO 


000 
283 


i 


1 OAA 

12U9 


1 T/l A _1_ 1 

1. (^4U ± 1 


K OA 

530 


000 
283 


1 AKAO 

1.0508 


1 T Art _i_ 1 
1.1 W ± 1 


K OA 

530 


JUd4oZO— dU411z 


o 
O 


09 


2 


6592 


J4 J6 JeCl dl (12 62 < 1 Kl A;2 


—1.530 ± 1 


920 


1 1 A1 


-1 
1 


1 01 A 

1219 


0.601 ± 1 


780 


1 1 A1 
llUl 


l.lDll 


—1.530 ± 1 


920 


JUyiDlo+U;^Uzz4 


o 
z 


1 1 


1 


3324 


026162^4^6^7^8 


/I A 1 


380 


CO 

3oz 


A 



(1(1 


1 AAA 1 A 

12.900 ± 4 


690 


K 

353 


A TOO 1 

0. 1 Z6l 


0000 1 C 


915 


JUy4zoo— iiU4zD 


3 


Oo 


i 


AK A K 

0o9o 


0261^4^7^8 


A QTO _l_ A 


TOT 


OTA 

2 (U 


A 



TOTA 

(3(0 


A /I TO _l_ 1 

—0.4(^0 ± 1 


A/? A 

OdO 


Of; A 
2d9 


1 AKO /I 

1.0524 


A TO _l_ A 


TOT 
161 


J Uy4z oo— 1 1U4ZD 


3 


Oo 


1 


7891 


0162J4J5J6 


OOA 1 A 

—2.330 ± 


495 


AA 

399 


A 



T/l OA 

(42U 


r ,1 A 1 A 

—5.240 ± 


487 


/I A/l 

4U4 


1 AOA 

2.1U29 


OOA 1 A 

—2.330 ± 


495 


T1AOAAA 001 OOC 

J iUoyuy— zoiozo 


3 


13 


-1 
i 


A A OA 

44z9 


01J7J8 


1 AOA _1_ 

— 1.980 ± z 


'70A 
/zu 


1 CO 

i08 


A 



694d 


CAA _1_ 

—2.090 ± 2 


OAA 

2U0 


1 CO 

158 


A AOOO 

0.92d8 


1 AOA _1_ 

—1.980 ± 2 


'70A 

IZU 


T1AOAAA 001 OOt? 

J iUoyuy— zoiozD 


3 


1 o 

13 


2 


7778 


^' ^' ^' ^ ^ K 7 7^ 7^ 7^ 
:? i:?2 J4C1 ftl ft2 /I3 12 fcl K2 


— 1.130 ± 


660 


OOA 

889 


A 



00 A ,i 

8394 


—0.755 ± 


657 


0A1 
891 


0.86(^6 


—1.130 ± 


660 


T1A0A01 TTT A1 

J iUoyzi— z / lyio 


o 
z 


zj 





8771 


0262^6^8 


1.750 ± 2 


020 


1 OA 

139 


-1 
i 


AAA^i 

0096 


AOA _l_ 1 

o.yzu ± i 


170 


1 OA 

139 


1 AOAC 

1.0305 


1 CA _l_ 

z.ioy ± z 


071 


J iUoyzi— z/ iyio 


o 
z 


oo 
zd 


-1 
i 


AAAO 

0093 


7, I, ^' 
O1O2J8 


A 1 '7 A _1_ A 
— O.i/4 ± 4 


1 AA 
190 


1 CO 

ioz 


A 



OOAO 

8892 


A CC _L 

— U.DDZ ± 


OCA 

2d0 


1 CO 

152 


A 00 '7 A 

0.83/^0 


r\ C AO _L 
— U.d4o ± 


OOA 

280 


T1A0A01 O'TI ril C 

J iUoyzi— z/ lyio 


o 
z 


oo 


1 


9721 




2.650 ± 1 


030 


OOA 

339 


-1 
i 


000'7 

383/ 


AOA _l_ A 

2.980 ± 


OAT 

847 


OOA 

339 


1 1 CI 

1.1613 


AOA _l_ A 

2.980 ± 


OAT 

847 


J iU4U0Z — Z IZi'^\) 


z 


oo 
3z 


1 


3861 


0261 62 jsjsjejVjsei/ii 713^1^2^311125455 


n A A 1 A 

0.446 ± 


693 


A 1 /I 

914 


-t 
i 


2514 


—0.565 ± 


734 


A1 

918 


1 OA1 A 

1.2919 


n A A 1 A 


693 


J iU4Uoz— z /z /4y 


z 


oo 
3z 


i 


/ /6i 


6i62iii4i5i6Ciei 


A Oi?0 _1_ 1 

O.zoz ± 1 


OOA 

3zU 


A OA 

4JU 


-1 
i 


A1 /t 

U124 


A Ti a _i_ 1 
0./16 ± 1 


At C\ 

41U 


A OA 

4JU 


1 A/t AA 

1.0499 


A 0/?0 _l_ 1 

U.ZOZ ± i 


OOA 

32U 


T11AOOC OCylCIC 

J iiUoZD— Z04DiD 


o 
z 


io 


1 


1868 


026162^4^5^6^7^8 


—1.110 ± 


695 


01 A 

814 


A 



CTAA 

8/99 


—0.155 ± 


945 


01 A 

814 


A 001 1 

0.8811 


— U.745 ± 


925 


T11AOOC OtIylCIC 

J iiUozD— zo4Dio 


z 


io 


1 


2029 


0162J4J6J7J8 


f\ coo _l_ A 

U.b22 ± 


831 


oco 
368 


A 



'7/1 'TO 


A '7CA _l_ A 

0.769 ± 


659 


0C7 


A '7CCO 

0. ibbo 


A COO _l_ A 

0.623 ± 


830 


J iiUoz5— ZD45i5 


2 


15 


1 


5515 


6l62i4i6i8Ciei 


—0.691 ± 1 


010 


343 





7530 


—0.619 ± 


967 


343 


0.7579 


A ^?^?A 1 A 


998 


J iiUozo— zo4oio 


z 


lo 


i 


OOOA 

o3o9 


02^1^4^561 


A fil _1_ A 

0.612 ± U 


OAC 

390 


1 A 

319 


A 



AO C /t 

9354 


A Add _1_ A 

0.4U6 ± 


A AA 

4U9 


1 
318 


1 A1 1 

1.0113 


A 1 _l_ A 

U.oiz ± U 


OAC 

395 


J iiiiio— UoU4Ui 


6 


rio 

yz 


o 
3 


CA^^ 

tUl 1 


jlCl 


^7 nAn _l_ £; 
/ .04U ± 


CAA 

690 


70 


A 
U 


fit la 
01 lb 


01 /I A A _1_ 1 

31.4U0 ± 13 


AAA 
000 


TA 


A COOO 

U.b6Z6 


00 ACO _L 1C 

zz.yoz ± io 


1 A 

134 


T110A1A lOylCOC 

J iizUiU— io4oz5 


3 


90 


1 


6283 


k ™ A A 


0.886 ± 1 


130 


L( ( 


-1 
i 


ACA/I 

0594 


— 1.270 ± 1 


170 


1 '7'7 


1 01 c c 
1.2155 


A OC _1_ "1 

0.886 ± 1 


130 


J iiz44z— i /Uoi / 


Z 


/I A 

4U 





8062 


blb2jijsj(ij7j8 


— 1.260 ± 


801 


Oil 


A 



9304 


1.900 ± 1 


200 


6(8 


A AO /I A 

0.924U 


1.738 ± 1 


373 


J112442— 170517 


2 


40 


1 


2342 


a2bib2j4j6jsdid2 


2.410 ±1 


540 


467 





9261 


1.880 ± 1 


590 


467 


A AO A/^ 

0.9306 


2.271 ±1 


571 


J 11 54 11 +063426 


2 


76 


1 


7739 


j4 j5 ^662 /ll /l2 /I3 ? 1 ^2 fcl /C2 fca il is 


-0.740 ± 


784 


625 





9578 


-0.154 ±0 


711 


627 


0.9825 


-0.739 ± 


784 


J115411+063426 


2 


76 


1 


8197 


6l62ili6j7j8Cl 


-0.948 ± 


974 


682 


1 


0724 


-1.080 ±0 


911 


682 


1.1191 


-0.948 ± 


974 


J115411+063426 


2 


76 


2 


3660 


jijVjsCiei 


3.090 ± 1 


780 


136 


1 


1989 


3.130 ± 1 


470 


136 


1.3170 


3.090 ± 1 


780 


J115944+011206 


2 


00 





7908 


6l62i4i6i8 


1.560 ± 1 


080 


170 





8878 


1.720 ± 1 


000 


170 


0.9531 


1.561 ±1 


080 


J115944+011206 


2 


00 


1 


3305 


6l62j4j6j7j8Cl 


1.970 ± 2 


670 


196 


1 


0164 


2.140 ± 2 


240 


196 


0.9765 


2.137 ± 2 


249 


J115944+011206 


2 


00 


1 


9438 


a2jij2j3j5eie2hih3kik2k3ii 


0.518 ±0 


442 


1031 





9472 


0.688 ±0 


433 


1035 


1.0251 


0.518 ±0 


442 


J120342+102831 


1 


89 


1 


3224 


ffl2 6l62j4j6j7j8Ciei 


-0.965 ±1 


930 


465 





9337 


-6.940 ± 2 


160 


465 


1.0082 


-0.965 ± 1 


930 


J120342+102831 


1 


89 


1 


3422 


a26i62jij6j7j8ei 


-3.210 ±1 


530 


459 





9903 


-2.000 ± 1 


440 


459 


0.9669 


-2.006 ± 1 


443 






Quasar name | z_em | z_abs | Transitions



J120342+102831 


1 


89 


1 


5789 




71211 40+1 fl3nn2 


2 


19 


1 


0496 


n '^n-\ h'-i 1 A Ir If Irr lo 


T1 23200—022404 


1 


04 





7569 


UjZ^l ^2J4:Jb 


T1 23200— 022404 


1 


04 





8308 


fJ-2^1^2J AJQJ7 JS 


T123437+075S43 


2 


57 


1 


0201 


ft 1 A la 'J'vio 


.1123437+075843 


2 


57 


1 


7194 




T1 ^^^^^+1 fi4Q0^ 


2 


OS 


n 


7446 
J234625+124743


2 


58 


2 


1733 


Jiciei 


3.970 ± 7.720 


112 


1.2946 


4.230 ± 7.440 


112 


1.2768 


4.160 ±7.517 


J234625+124743 


2 


58 


2 


5718 


jieie2 


-16.700 ±6.930 


379 


1.4706 


-19.000 ±6.070 


379 


1.4764 


-17.274 ±6.799 


J234628+ 124858 


2 


52 


1 


1084 




-1.530 ±2.570 


64 


0.9661 


-1.560 ±2.340 


64 


1.0108 


-1.536 ±2.527 


J234628+124858 


2 


52 


1 


5899 


jejrei 


3.180 ±2.250 


93 


0.8769 


3.030 ±2.270 


93 


0.8377 


3.051 ± 2.268 


J234628+124858 


2 


52 


2 


1713 


h2jijAj&cidid2 


-0.823 ± 0.940 


273 


0.6703 


0.290 ±0.681 


285 


0.7865 


-0.794 ±0.951 


J235034-432559 


2 


88 


1 


7962 


bij6did2ei 


-0.400 ± 3.670 


182 


0.8125 


1.340 ±3.150 


182 


0.7991 


0.942 ± 3.357 
4-5.1 Weighted mean for the VLT data

We initially fit a weighted mean to our VLT points. The LTS method indicates that the
z = 1.542 absorber toward J000448-415728 is an outlier, with a residual of 4.2σ about
the LTS fit, and so we remove this point. If we do not remove this point, the weighted
mean after increasing errors is Δα/α = (0.154 ± 0.132) × 10⁻⁵, with χ²_ν = 1.17.

After removing this point, a weighted mean fit with our raw statistical errors yields
Δα/α = (0.229 ± 0.095) × 10⁻⁵ with χ²_ν = 1.78. Applying the LTS method to this
data set yields a random error estimate of σ_rand = 0.905 × 10⁻⁵.
After accounting for this extra random error, the weighted mean becomes Δα/α = (0.208 ±
0.124) × 10⁻⁵, with χ²_ν = 0.99. This result differs from that of Murphy et al. (2004) at the
∼4.7σ level. Although this appears to be a gross inconsistency, as will be seen below it is
more likely that this reflects the fact that a weighted mean model is not a good description
of the data set.
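The size of this discrepancy follows from the difference between the two weighted means divided by their combined uncertainty. In the sketch below the VLT value is the one quoted above; the Keck value attributed to Murphy et al. (2004) is not quoted in this section and is an assumption supplied for illustration, as is the function name.

```python
import math

def tension_sigma(value_a, err_a, value_b, err_b):
    """Difference between two independent estimates in units of their combined 1-sigma error."""
    return abs(value_a - value_b) / math.hypot(err_a, err_b)

# VLT weighted mean from the text, in units of 1e-5.
vlt, vlt_err = 0.208, 0.124
# Keck weighted mean of Murphy et al. (2004), in units of 1e-5.
# ASSUMED here: this value is not stated in the present section.
keck, keck_err = -0.57, 0.11

print(f"{tension_sigma(vlt, vlt_err, keck, keck_err):.1f} sigma")
```

With these inputs the difference comes out close to the ∼4.7σ quoted in the text.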

4-5.1.1 Distribution of Δα/α values with redshift and validity of a weighted
mean model

In the bottom panel of figure 4.9 we show binned values of Δα/α plotted against redshift
for the VLT sample. For z < 1.5, 3 of the 5 binned points fall in the region Δα/α < 0.
For z > 1.5, 6 of 7 points in the binned plot fall in the region Δα/α > 0. This trend
with redshift is different to that seen in fig. 6 of Murphy et al. (2004): there, for z < 1.6, all 7
points fall in the region Δα/α < 0, and for z > 1.5 all 6 points also fall in the region
Δα/α < 0. The apparent change in sign of Δα/α with z in the VLT sample suggests that
a weighted mean model is not a good description of the VLT data.

4-5.2 Dipole fit for the VLT data 

In this section, we fit the dipole model of equation 4.19 to the new VLT data. 

Inspection of the residuals about the fit, plotted as a function of redshift, reveals no obvious
trend for higher scatter at higher redshifts and therefore we treat all absorbers the
same in attempting to estimate σ_rand. We again identify the z = 1.542 system toward
J000448-415728 as an outlier, with a residual of 4.6σ about the LTS fit, even after increasing
the error bars. Thus, we remove this system from our sample, and re-estimate
σ_rand = 0.905 × 10⁻⁵. We call this sample "VLT-dipole".

Our dipole fit parameters after adding σ_rand = 0.905 × 10⁻⁵ in quadrature to all error bars
are: m = (−0.109 ± 0.180) × 10⁻⁵, A = 1.18 × 10⁻⁵ (1σ confidence limits [0.80, 1.66] × 10⁻⁵),
RA = (18.3 ± 1.2) hr and dec = (−62 ± 13)°. For this fit, χ² = 141.8 and χ²_ν = 0.95.
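As an illustration of how these parameters are used, the sketch below evaluates the dipole+monopole prediction Δα/α = A cos(Θ) + m for an arbitrary sightline, where Θ is the angle between the sightline and the dipole direction. The default parameter values are the best-fitting VLT values quoted above (in units of 10⁻⁵); the function names and the use of the standard spherical law of cosines for Θ are assumptions of this example.

```python
import math

def angle_to_dipole(ra_hr, dec_deg, dip_ra_hr, dip_dec_deg):
    """Angular separation Theta (radians) between a sightline and the dipole axis."""
    ra, dec = math.radians(15.0 * ra_hr), math.radians(dec_deg)      # 1 hr = 15 degrees
    dra, ddec = math.radians(15.0 * dip_ra_hr), math.radians(dip_dec_deg)
    cos_t = (math.sin(dec) * math.sin(ddec)
             + math.cos(dec) * math.cos(ddec) * math.cos(ra - dra))
    return math.acos(max(-1.0, min(1.0, cos_t)))                      # guard against rounding

def dipole_model(ra_hr, dec_deg, A=1.18, m=-0.109, dip_ra_hr=18.3, dip_dec_deg=-62.0):
    """Predicted Delta alpha / alpha (units of 1e-5) from A*cos(Theta) + m."""
    return A * math.cos(angle_to_dipole(ra_hr, dec_deg, dip_ra_hr, dip_dec_deg)) + m

# Prediction along the dipole axis, opposite to it, and perpendicular to it.
print(dipole_model(18.3, -62.0), dipole_model(6.3, 62.0), dipole_model(18.3, 28.0))
```

Along the axis the prediction is A + m, in the opposite direction it is m − A, and at 90° from the axis only the monopole term m remains.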
Figure 4.7: Distribution of certain transitions in the VLT MM sample. The transitions
shown are representative of those used at low redshifts (the Mg I/Mg II/Fe II combination,
excluding the Fe II λλ1608 and 1611 transitions), moderate redshifts (most transitions),
and high redshifts (the Si II/Fe II λ1611, 1608/Al II combination, with the additional use
of Zn II, Cr II and Ni II). The vertical scales alternate.






Table 4.3: The frequency of occurrence for each MM transition in our VLT fits. Note
that the Mg I λ2026 transition is fairly weak compared to the Mg I λ2852 transition, and
so is included in few fits. Where we have included Mg I λ2852 in our fit, and Zn II λ2026
is included in our fit, Mg I λ2026 will also be modelled, although the contribution may
be extremely minor. Nevertheless, we count this as an occurrence of Mg I λ2026, as that
transition is included in our model. The transition key provides a convenient, short-hand
way of referring to a particular transition. This key is used in table 4.2. The q-coefficients
given are those used in our analysis.



Transition      q (cm⁻¹)   Key   Frequency of occurrence
Mg I λ2026          87     a1      3
Mg I λ2852          86     a2     53
Mg II λ2796        211     b1     88
Mg II λ2803        120     b2     86
Al II λ1670        270     c1     60
Al III λ1854       464     d1     25
Al III λ1862       216     d2     25
Si II λ1526         50     e1     57
Si II λ1808        520     e2     31
Cr II λ2056      -1110     h1     21
Cr II λ2062      -1280     h2     15
Cr II λ2066      -1360     h3     17
Fe II λ1608      -1300     j1     50
Fe II λ1611       1100     j2      9
Fe II λ2260       1435     j3     12
Fe II λ2344       1210     j4     97
Fe II λ2374       1590     j5     51
Fe II λ2382       1460     j6    100
Fe II λ2587       1490     j7     74
Fe II λ2600       1330     j8     97
Mn II λ2576       1420     i1     13
Mn II λ2594       1148     i2      9
Mn II λ2606        986     i3      9
Ni II λ1709        -20     k1     22
Ni II λ1741      -1400     k2     24
Ni II λ1751       -700     k3     21
Ti II λ3067        791     g1
Ti II λ3073        677     g2
Ti II λ3230        673     g3
Ti II λ3342        541     g4      1
Ti II λ3384        396     g5      1
Zn II λ2026       2479     l1      9
Zn II λ2062       1584     l2     13
Figure 4.8: Relationship between q coefficients and observed wavelength for all utilised
transitions in all absorbers in the VLT sample. Although the low-z Fe II/Mg II combination
is sensitive to low-order wavelength distortions because the q coefficients for this
combination are correlated with wavelength (see figure 4.2), one can see that for the full
sample there is little correlation between observed wavelength and q, making the MM
method resistant to systematics when many absorbers at different redshifts are used.



To assess the dipole fit compared to a monopole-only (weighted mean) fit, we compare it with a
weighted mean fit with errors adjusted according to the same σ_rand as used for the dipole
fit, in order to ensure consistency of the data points used. As the weighted mean fit has
χ² = 149.8, the dipole fit yields a reduction in χ² of 7.9 for an extra 3 degrees of freedom,
when a reduction of ∼3 would be expected by chance. Our bootstrap method yields
a significance for the dipole+monopole model over the monopole-only model at the 97.1
percent confidence level (2.19σ), indicating marginal evidence for the existence of a dipole
when considering only the VLT data. We demonstrate this fit in figure 4.10.
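Two small conversions help in reading these numbers: turning a confidence level into a two-sided Gaussian-equivalent significance, and, as a rough cross-check only, the analytic probability of a χ² reduction this large for 3 extra parameters. The analytic χ² approximation is not the bootstrap method used in this work and is included purely as an illustrative comparison; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def confidence_to_sigma(conf):
    """Two-sided Gaussian-equivalent significance for a confidence level in (0, 1)."""
    return NormalDist().inv_cdf(0.5 * (1.0 + conf))

def chi2_sf_3dof(x):
    """Survival function of a chi-squared variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

delta_chi2 = 7.9  # reduction in chi^2 quoted in the text for 3 extra parameters
print(f"analytic chance probability: {chi2_sf_3dof(delta_chi2):.3f}")
print(f"97.1 per cent corresponds to {confidence_to_sigma(0.971):.2f} sigma")
```

Because the confidence level is quoted to three figures, the converted significance agrees with the quoted 2.19σ only to within rounding.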

We also give the parameters for a dipole-only (no monopole) fit in table 4.4. 



4-5.2.1 Effect of the choice of the method-of-moments estimator 

In section 4-4.6 we suggested that a method-of-moments estimator was preferable in attempting
to reconcile Δα/α values from turbulent and thermal fits. It is legitimate to ask
whether our results differ if we simply choose that fit (turbulent or thermal) which has
Figure 4.9: Values of Δα/α for the VLT sample. The top panel shows all values of Δα/α
with error bars increased in quadrature with σ_rand = 0.905 × 10⁻⁵. The middle panel shows
the same data as the top panel, with the vertical range restricted for better viewing
of the higher statistical weight points. Both of the panels have been shaded according to
a greyscale as the logarithm of the uncertainty estimate, with lower uncertainty points
being darker. The bottom panel shows binned values of Δα/α where approximately 12
points contribute to each bin. The bottom panel appears to demonstrate a change of sign
for Δα/α with increasing z. In particular, for z < 1.5, 3 of the 5 binned points fall in the
region Δα/α < 0. For z > 1.5, 6 of 7 points in the binned plot fall in the region Δα/α > 0.
This trend with redshift is different to that seen in fig. 6 of Murphy et al. (2004), for which
all binned points fall in the region Δα/α < 0. This suggests that a weighted mean model
is not a good description of the data.
Figure 4.10: Binned values of Δα/α plotted against angle to the best-fitting dipole, Θ (in degrees),
for the VLT sample. The red, solid line is the model Δα/α = A cos(Θ) + m, and the
dashed, blue lines indicate the 1σ uncertainty on the dipole fit. Statistical errors have
been increased prior to binning as described in the text. The dipole+monopole model
is preferred over the monopole-only model at the 2.2σ level. The parameters for this fit are:
m = (−0.109 ± 0.180) × 10⁻⁵, A = 1.18 × 10⁻⁵ (1σ confidence limits [0.80, 1.66] × 10⁻⁵),
RA = (18.3 ± 1.2) hr and dec. = (−62 ± 13)°.

the lowest χ²_ν instead of applying our method-of-moments estimator. The results for a
VLT dipole model if we do this are: σ_rand = 0.928 × 10⁻⁵, m = (−0.112 ± 0.184) × 10⁻⁵,
A = 1.15 × 10⁻⁵ (1σ confidence limits [0.76, 1.63] × 10⁻⁵), RA = (18.2 ± 1.2) hr, dec =
(−62 ± 14)°. The dipole model is preferred over the monopole model at the 96 percent
level (2.1σ). Thus our choice of the method-of-moments estimator does not change the
results significantly, although σ_rand is mildly larger if we simply choose those fits which
have the lowest χ²_ν.

4-5.3 Summary of VLT results 

In this section, we have described the analysis of 154 new MM absorbers. The VLT sample
appears to display a different trend of Δα/α with redshift to that seen in Murphy et al.
(2004): in the VLT sample, Δα/α appears to grow more positive with increasing redshift,
whereas fig. 6 of Murphy et al. seems to suggest that Δα/α becomes more negative with
increasing redshift. We showed that, in the VLT sample, an angular dipole model is
preferred over a weighted mean model at the 2.2σ level, which seems to suggest angular
(and therefore spatial) variations in α. The direction of maximal increase in α is found
to be RA = (18.3 ± 1.2) hr and dec. = (−62 ± 13)° under a simple dipole model. We
therefore explore the consistency of our Δα/α values and model parameters with those
derived from the same models applied to the Keck sample, and a Keck + VLT sample, in
the next section.

We have shown that the VLT Δα/α values display excess scatter (χ²_ν > 1) about the
simple models described. This is likely due to both model mis-specification (from the use
of simple weighted mean and angular dipole models) and unmodelled uncertainties.
We described in section 4-4.3 a number of potential random effects which could give rise
to excess scatter in the data, even if our model for Δα/α were correct. It is difficult to
determine the contribution of each of these effects to the error budget, and so we have
assumed that all absorbers are affected by the same processes, and have therefore
conservatively increased our error bars by adding a σ_rand term in quadrature. If the extra
scatter in the Δα/α values is due to inaccuracies in modelling the velocity structure of the
absorbers, observations at higher signal-to-noise ratios and higher resolving powers might
help reduce the scatter. On the other hand, if the inter-component spacing is comparable
to the intrinsic line widths then this may not be the case.
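
The effect of the quadrature term can be illustrated with a short sketch. The Δα/α values and statistical errors below are hypothetical (they are not drawn from the VLT sample), and the helper names `inflate_errors`, `weighted_mean` and `reduced_chi2` are introduced purely for illustration; only σ_rand = 0.905 × 10⁻⁵ is a value quoted in this chapter. The sketch shows how a common σ_rand term lowers χ²_ν about a weighted mean; it is not the LTS-based procedure used to estimate σ_rand.

```python
import math

def inflate_errors(stat_errors, sigma_rand):
    """Add a common random-error term in quadrature to each statistical error."""
    return [math.hypot(err, sigma_rand) for err in stat_errors]

def weighted_mean(values, errors):
    """Inverse-variance weighted mean."""
    weights = [1.0 / err ** 2 for err in errors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def reduced_chi2(values, errors, n_params=1):
    """chi^2 per degree of freedom about the weighted mean (one fitted parameter)."""
    mean = weighted_mean(values, errors)
    chi2 = sum(((v - mean) / err) ** 2 for v, err in zip(values, errors))
    return chi2 / (len(values) - n_params)

# Hypothetical values and statistical errors, in units of 1e-5 (not real data)
values = [0.9, -1.4, 2.1, -0.3, -2.0, 1.2]
stat_errors = [0.6, 0.8, 0.7, 0.5, 0.9, 0.6]

sigma_rand = 0.905  # VLT value quoted in this chapter, in units of 1e-5
total_errors = inflate_errors(stat_errors, sigma_rand)

chi2_before = reduced_chi2(values, stat_errors)
chi2_after = reduced_chi2(values, total_errors)
print(f"chi2_nu before: {chi2_before:.2f}, after: {chi2_after:.2f}")
```

Because the same σ_rand is added to every point, absorbers with small statistical errors lose relative weight, which is the sense in which the inflation is conservative.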

We consider the specific effect of wavelength scale distortions on the VLT sample in sections 
5-2 and 5-4, and show there how such distortions can give rise to extra scatter in the data. 



4-6 Combination and comparison with previous Keck results

In table 4.4 we give the estimates of parameters and their associated uncertainties under
various models fitted to the VLT, Keck and VLT + Keck Δα/α samples. The particular
models and results are described in more detail in the following sections.

4-6.1 Previous Keck results

Although the data of Murphy et al. (2004) do not demonstrate a statistically significant 
dipole, one can nevertheless calculate the location of a (non-significant) dipole in the data. 

We note briefly that Murphy et al. noticed significant wavelength calibration problems in
the spectrum of Q2206-1958 (J220852-194359) from sample 3 of Murphy et al. (2003a)
for λ > 5000 Å, of the order of ~5 km s⁻¹, at the time of that analysis. The two absorbers
contributed by this spectrum were erroneously included in that paper, and so we remove
them from the sample.

Murphy et al. (2004) divide their sample into two portions, a high-contrast sample and a
low-contrast sample. The high-contrast sample was defined by 27 absorbers where there
were significant differences between the optical depth in the transitions used. Murphy
et al. (2003a) give arguments as to why this might be expected to generate extra scatter
in the Δα/α values. Because many of the high redshift (z > 1.8) absorbers
considered in Murphy et al. (2004) are associated with damped Lyman-α systems, this
effect manifests itself as extra scatter in the Δα/α values about a weighted mean at high
redshifts. For the VLT sample, we note that there is no evidence for excess scatter at
higher redshifts compared to lower redshifts.

We can examine the differences between the Keck and VLT samples in terms of the
prevalence of weak species as follows. Firstly, define the following transitions as weak: Mg I
λ2026, Si II λ1808, the Cr II transitions, Fe II λλλ1608, 1611, 2260, the Mn II transitions,
the Ni II transitions, the Ti II transitions and the Zn II transitions. From the table of
frequency of occurrence in Murphy et al. (2003a), at z < 1.8 these transitions constitute
about 3 percent of the total number of transitions used. On the other hand, in the VLT
sample these transitions constitute about 13 percent of the sample used. The significantly
greater prevalence of these weak transitions at low redshifts in the VLT sample may explain
the lack of evidence for differential scatter between high and low redshifts. Effectively, the
greater prevalence of weak species in the low-z VLT sample may increase the scatter at low
redshifts in that sample, making any low-z/high-z difference appear smaller. We retain
the high/low contrast distinction when analysing the Keck sample.

4-6.1.1 LTS method applied to the Murphy et al. (2004) results 

If we apply the LTS method to the high-contrast sample to estimate the extra error needed
about a dipole model, we find that an extra error term of σ_rand = 1.630 × 10⁻⁵ is needed.
With this extra term, χ²_ν = 1.13, indicating that the distribution is mildly leptokurtic
(fat-tailed). The low-contrast sample data are already consistent under a dipole model
with the LTS method (σ_rand = 0).

We then combine the high-contrast Δα/α values (with error bars increased) with the
low-contrast Δα/α values to form a new sample under a dipole model (equation 4.19). The LTS
method applied to this set reveals that the Δα/α values are consistent about a dipole model.
Additionally, χ²_ν = 1.04. Nevertheless, we identify one possible outlier from this set: the
absorber with z ≈ 2.84 towards Q1946+7658, with Δα/α = (-4.959 ± 1.334) × 10⁻⁵, and
remove this absorber from the sample. This point has a residual of -3.6σ about the LTS
fit. We refer to this sample as "Keck04-dipole".

A dipole fitted to this sample yields RA = (16.0 ± 2.7) hr, dec. = (-47 ± 29)°, and
A = 0.41 × 10⁻⁵. The 1σ confidence limits on A are [0.29, 0.78] × 10⁻⁵. The monopole is
m = (-0.465 ± 0.145) × 10⁻⁵. This fit has χ²_ν ≈ 0.96. The dipole model is preferred over
the weighted mean model at the 36 percent confidence level (0.47σ).

The monopole offset appears to be significant at the 3.2σ confidence level, but this is related
to the fact that the Keck results alone do not clearly support a dipole interpretation.

For a dipole model with no monopole (Δα/α = A cos Θ), the fitted parameters are A =
1.06 × 10⁻⁵ (1σ confidence limits [0.82, 1.34] × 10⁻⁵), RA = (16.4 ± 1.2) hr, dec. =
(-56 ± 12)°. This model is significant at the 72 percent confidence level (1.1σ).

4-6.2 Combined weighted mean 

We create a combined weighted mean fit by combining the VLT-dipole sample with the
Keck04-dipole sample. The VLT sample has had errors increased in quadrature with
σ_rand = 0.905 × 10⁻⁵, whereas the Keck high-contrast sample has had errors increased in
quadrature with σ_rand = 1.743 × 10⁻⁵. The same points identified as outliers have been
removed.

This leads to a weighted mean of (Δα/α)_w = (-0.216 ± 0.086) × 10⁻⁵ with χ²_ν = 1.03.
However, a weighted mean model does not appear to adequately capture all the information
in the data (see figure 4.11). Comparing the weighted mean of the z > 1.6 points for both
samples yields a simple demonstration of the north/south difference. For the VLT sample,
(Δα/α)_w(z > 1.6) = (0.533 ± 0.172) × 10⁻⁵, whereas for the Keck sample (Δα/α)_w(z > 1.6) =
(-0.603 ± 0.224) × 10⁻⁵. The difference between these weighted means is 4σ.
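
The quoted 4σ difference can be checked with a short calculation. The sketch below assumes the two weighted means are independent and Gaussian, so that their uncertainties combine in quadrature; the function name `difference_significance` is introduced here for illustration only.

```python
import math

def difference_significance(x1, err1, x2, err2):
    """Number of standard deviations separating two independent estimates."""
    return abs(x1 - x2) / math.hypot(err1, err2)

# z > 1.6 weighted means quoted in the text, in units of 1e-5
vlt_mean, vlt_err = 0.533, 0.172
keck_mean, keck_err = -0.603, 0.224

n_sigma = difference_significance(vlt_mean, vlt_err, keck_mean, keck_err)
print(f"VLT - Keck difference at z > 1.6: {n_sigma:.1f} sigma")
```

The same arithmetic applies to any pair of independent estimates with quoted 1σ uncertainties.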

4-6.3 Combined dipole fit 

To create our combined dipole fit, we combine the VLT-dipole sample with the
Keck04-dipole sample to create the "combined dipole" sample, our main sample. This sample
consists of 293 MM absorbers. Importantly, both of these sets exhibit no |r_i| > 3 residuals,
and thus a combined fit is unlikely to exhibit any large residuals provided that both data
sets are well described by the same model. If the data sets are inconsistent, one might
expect large-residual points to emerge.

For an angular dipole fit to these Δα/α values (Δα/α = A cos Θ + m), we find that
m = (-0.178 ± 0.084) × 10⁻⁵, A = 0.97 × 10⁻⁵, RA = (17.3 ± 1.0) hr, dec. = (-61 ± 10)°,
with χ² = 280.6 and χ²_ν = 0.97. The 1σ confidence limits on A are [0.77, 1.19] × 10⁻⁵. A
weighted mean fit to the same Δα/α values and uncertainties yields χ² = 303.8, and so
a dipole model yields a reduction in χ² of 23.2 for an extra 3 free parameters. With our
bootstrap method, we find that the dipole model is preferred over the weighted mean fit
at the 99.995 percent confidence level (4.06σ), thus yielding significant evidence for the
existence of angular variations in α. Using the method of Cooke & Lynden-Bell (2010),
the significance of the dipole is found to be 4.07σ.
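
As a rough cross-check on the bootstrap result, one can ask what significance the reduction in χ² would imply if it followed a χ² distribution with 3 degrees of freedom under the monopole-only hypothesis. This asymptotic assumption is ours and is not the method used in the text, which relies on the bootstrap; the function names are illustrative. The sketch also converts the quoted 99.995 percent confidence level into a two-sided Gaussian-equivalent significance.

```python
import math
from statistics import NormalDist

def chi2_sf_3dof(x):
    """Survival function of a chi-squared distribution with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def p_to_sigma(p):
    """Two-sided Gaussian-equivalent significance of a chance probability p."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)

delta_chi2 = 303.8 - 280.6  # weighted-mean chi^2 minus dipole chi^2, from the text

p_asymptotic = chi2_sf_3dof(delta_chi2)          # assumes the asymptotic chi^2 form
sigma_asymptotic = p_to_sigma(p_asymptotic)
sigma_bootstrap = p_to_sigma(1.0 - 0.99995)      # confidence level quoted in the text

print(f"asymptotic: p = {p_asymptotic:.2e}, {sigma_asymptotic:.2f} sigma")
print(f"bootstrap confidence level: {sigma_bootstrap:.2f} sigma")
```

Under this assumption the asymptotic estimate comes out at about 4.1σ, close to the bootstrap value, which suggests that the quoted significance is not strongly sensitive to how the null distribution is obtained.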

Importantly, the combination of the Keck04-dipole Δα/α values with the VLT-dipole
Δα/α values yields χ²_ν ≈ 1 about a dipole model. If inter-telescope systematics were
present, we would expect the combination of the Keck and VLT data to yield a χ²_ν that is
significantly greater than unity under the dipole model, despite χ²_ν being ≈ 1 when that
model is fitted to the samples individually. Thus, there is no significant evidence based
on χ²_ν that inter-telescope systematics are present.

We show in figure 4.12 the values of Δα/α for both Keck and VLT against the best-fitting
dipole model. We give binned values there, which yields a visual demonstration of the
dipole effect. We also give there a plot of the standardised residuals about the fit, which
demonstrates that the fit is statistically reasonable. We also show binned values of Δα/α
for the Keck, VLT and combined samples in figure 4.11. We show an unbinned version of
these data for |Δα/α| < 5 × 10⁻⁵ in figure 4.13.

For a model with no monopole (Δα/α = A cos Θ), the fitted parameters are A = 1.02 ×
10⁻⁵ (1σ confidence limits [0.83, 1.24] × 10⁻⁵), RA = (17.4 ± 0.9) hr, dec. = (-58 ± 9)°.
This model is significant at the 99.996 percent level (4.14σ).

In figures 4.14 and 4.15, we show the confidence limits on the dipole location for the
Keck, VLT and combined samples. The individual symbols illustrate the weighted mean
of Δα/α along each sightline under the models Δα/α = A cos Θ and Δα/α = A cos Θ + m
respectively.

There are several significant points to consider from these results: 

1. The dipole is statistically significant. Even after accounting for random errors in
a conservative fashion, the statistical significance of the dipole is greater than 4σ.
This is strong statistical evidence for angular and therefore spatial variation in α.

2. Dipole models fitted to the Keck and VLT Δα/α values yield consistent estimates for
the pole direction. This is important, and would be very surprising if one assumes
that a dipole effect is not present. If two different systematic effects were operating
in each telescope so as to produce a trend in Δα/α, then: a) it is unlikely that
these effects would be correlated with sky position, and b) even if systematic effects
existed in both telescopes which were correlated with sky position, it is very unlikely
that such effects would occur in such a way as to yield very consistent estimates
of the dipole position between the two telescopes, with a similar amplitude,
particularly when the two telescopes are independently constructed and separated by
~45° in latitude. Any attempt to ascribe the observed variation in α to systematics
must account for the good alignment of the dipole vectors from dipole models fitted
independently to the Keck and VLT samples. Note that telescope or instrumental
systematics which depend only on wavelength cannot produce observed angular
variation in α for a sufficiently large sample of absorbers.

3. The VLT and Keck Δα/α values appear consistent near the equatorial region of the
dipole. From the middle panel of figure 4.12, both the VLT and Keck results show
large variation from Δα/α = 0 near the pole (Θ = 0°) and anti-pole (Θ = 180°)
of the dipole, but show much less variation in the equatorial region (Θ = 90°). So,
at least visually, the Keck and VLT points are not inconsistent in the region where
they overlap. This issue is addressed quantitatively in the caption to figure 4.13.

4. The dipole effect is not being caused by large residual points. The bottom panel of
figure 4.12 clearly shows that there are no |r_i| > 3σ points present.

Figure 4.11: Binned values of Δα/α by redshift in the VLT-dipole sample (bottom panel,
circles), the Keck04-dipole sample (middle panel, squares) and the combination of the two
(top panel, triangles). The value of Δα/α for each bin is calculated as the weighted mean
of the values of Δα/α from the contributing absorbers. The statistical errors for certain
points have been increased prior to binning, as described in the text. Note that for z > 1.5,
the Keck data generally indicate Δα/α < 0, whereas the VLT data indicate Δα/α > 0.
As Keck is located in the northern hemisphere, and VLT is in the south, this is a rough
visual demonstration of the dipole effect. However, given the overlap between the samples,
the proper procedure is to directly fit a dipole (see figure 4.12). Interestingly, both the
VLT and Keck data seem to support Δα/α < 0 for z < 1.5. This effect is considered
in more detail in section 4-6.8.

Figure 4.12: The top panel shows the combined results for the Keck and VLT samples,
plotting Δα/α against angle from the fitted dipole location for the combination of the
Keck and VLT Δα/α values, binned together. The middle panel shows the data from
different telescopes, with Keck as squares and VLT as circles. Points in the top panel
contain approximately 25 absorbers per bin, whereas points in the middle panel contain
approximately 12. The model shown (red, solid line) is Δα/α = A cos(Θ) + m. The
parameters for this model are: m = (-0.178 ± 0.084) × 10⁻⁵, A = 0.97 × 10⁻⁵ (1σ confidence
limits [0.77, 1.19] × 10⁻⁵), RA = (17.3 ± 1.0) hr, dec. = (-61 ± 10)°. The dashed, blue
lines indicate the 1σ uncertainty on the fit, including the uncertainty in determining the
position of the dipole. In both the top and middle panels, the dotted horizontal line
indicates the monopole value. The vertical dotted line shows 90°. The bottom panel
indicates the standardised residuals (r_i = [data - model]/error) about the best fit. The
presence of no points with |r_i| > 3 indicates that the fit is not being dominated by a small
number of large residual points.

Figure 4.13: Δα/α against the angle from the fitted dipole location under the model
Δα/α = A cos(Θ) + m for the VLT and Keck Δα/α values. In contrast to figure 4.12,
these data are not binned. Blue circles are VLT absorbers and pink squares are Keck
absorbers. Error bars have been omitted; instead, larger symbols indicate Δα/α values
with greater statistical weight, according to the key provided. The precision includes
the effect of σ_rand. The dipole trend is visible as the presence of more and larger points
in the upper left and lower right quadrants. The visual cluster of points at Θ < 47°
is due to 4 quasars which contribute 14 values of Δα/α (2 points not shown because
they lie beyond the vertical range of the plot). One can investigate the consistency of
the VLT and Keck Δα/α values in the region near the dipole equator (defined here as
80° < Θ < 100°) by comparing the weighted mean of the Δα/α values. In this case,
(Δα/α)_w(VLT) - (Δα/α)_w(Keck) = (0.32 ± 0.19) × 10⁻⁵, giving no significant evidence for a
difference between the two samples. In this region, the VLT sample contributes 39 points
and the Keck sample contributes 43 points. The difference here is calculated so as to
include the effect of σ_rand.

4-6.3.1 Bayesian evidence 

In the Bayesian paradigm, a quantity of fundamental interest for model selection is the
Bayesian evidence. For some model M_i, data set D and vector of parameters x_i (of
dimension p_i) the evidence is given by

\[
\Pr(D|M_i) = \int \Pr(D|x_i, M_i)\,\Pr(x_i)\,\mathrm{d}x_i. \tag{4.36}
\]

Suppose a competing model for the same data set has parameters x_j. For the evidence
in favour of a dipole + monopole model (described by the 4 parameters x_d) against a
monopole-only model (described by the parameter x_m), the Bayes factor determines the
evidence in favour of one model over the other, namely

\[
B = \frac{\int \Pr(D|x_d)\,\Pr(x_d)\,\mathrm{d}x_d}{\int \Pr(D|x_m)\,\Pr(x_m)\,\mathrm{d}x_m}. \tag{4.37}
\]

The evidence is computationally difficult to evaluate, especially for high numbers of
dimensions: the integration must generally be carried out through Monte Carlo means,
and naive Monte Carlo integration degrades exponentially with increasing dimensionality.
One option is to assume that the posterior PDF is approximately Gaussian. For our data,
this should be at least approximately true on account of the central limit theorem, as
we have 293 points in our main sample. Although we noted that (A, RA, dec.) are not
normally distributed, (c_x, c_y, c_z) should be.



By approximating the posterior probability as a Gaussian, one obtains

\[
\Pr(x_i|D) \propto \Pr(\hat{x}_i|D)\,\exp\left[ -\frac{1}{2} (x_i - \hat{x}_i)^{T} C_i^{-1} (x_i - \hat{x}_i) \right], \tag{4.38}
\]

where x̂_i is the best estimate of the parameters and C_i is the covariance matrix at the
best-fitting solution. This leads (Hobson et al., 2002) to the approximation

\[
\Pr(D|M_i) \approx (2\pi)^{p_i/2}\,|C_i|^{1/2}\,\Pr(\hat{x}_i)\,\Pr(D|\hat{x}_i, M_i), \tag{4.39}
\]

where p_i is the number of parameters (the dimension of x_i). This is known as the Laplace
approximation. This expression requires that Pr(x_i), the prior for the parameters, and
Pr(D|x_i, M_i), the likelihood function for the fit, are appropriately normalised, such that
∫ Pr(x_i) dx_i = 1 and ∫ Pr(D|x_i) dD = 1.



The likelihood for the data points is given by

\[
\Pr(D|x_i, M_i) = \prod_{j=1}^{N} \frac{1}{\sigma_j \sqrt{2\pi}} \exp\left[ -\frac{\left( y_j - f_i(x_i)_j \right)^2}{2\sigma_j^2} \right], \tag{4.40}
\]

where y_j is the jth value of Δα/α, σ_j is the associated uncertainty and f_i(x_i)_j is the
model prediction under the ith model. We can drop certain terms here, because when
comparing two models we consider the same data set (i.e. the σ_j are common). Thus we
can use instead

\[
\Pr{}'(D|x_i, M_i) = \prod_{j=1}^{N} \exp\left[ -\frac{\left( y_j - f_i(x_i)_j \right)^2}{2\sigma_j^2} \right] = \exp\left[ -\frac{\chi_i^2(x_i)}{2} \right], \tag{4.41}
\]

where we write χ²_i(x_i) to indicate that the model is evaluated at some value of the
parameters, not necessarily the maximum likelihood estimate.
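
To make the structure of equation 4.39 concrete, the sketch below evaluates the logarithm of the Laplace approximation with exp(-χ²/2) from equation 4.41 as the likelihood. The function names and argument layout are our own, and the covariance determinants needed for the real fits are not quoted in the text, so the only worked case is a one-parameter toy problem (Gaussian likelihood of width s, flat prior of width W) in which the approximation is exact when the prior is wide.

```python
import math

def laplace_log_evidence(chi2_best, det_cov, prior_at_best, n_params):
    """Log of equation 4.39 with exp(-chi^2 / 2) as the (unnormalised) likelihood."""
    return (0.5 * n_params * math.log(2.0 * math.pi)
            + 0.5 * math.log(det_cov)
            + math.log(prior_at_best)
            - 0.5 * chi2_best)

def log_bayes_factor(model_a, model_b):
    """ln B for model A over model B; each model is a dict of keyword arguments."""
    return laplace_log_evidence(**model_a) - laplace_log_evidence(**model_b)

# Toy check (hypothetical numbers): one parameter, posterior width s, flat prior width W
s, width = 0.2, 10.0
approx = math.exp(laplace_log_evidence(chi2_best=0.0, det_cov=s ** 2,
                                       prior_at_best=1.0 / width, n_params=1))
exact = math.sqrt(2.0 * math.pi) * s / width  # integral of the Gaussian times 1/W
print(approx, exact)
```

The ratio s/W appearing here is the usual Occam factor: a model is penalised by the fraction of its prior volume that the data leave viable, which is why the choice of prior width matters for the evidence.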

The only issue left is to evaluate the prior, Pr(x_i). Unfortunately, the estimation of the
evidence is sensitive to the choice of priors. Firstly, note that we are comparing a dipole
+ monopole model to a monopole-only model. If we assume the same uniform prior for
the monopole in both models, it will be a common factor in the evidence for both models
and therefore will cancel. Thus we need not choose any particular range for the prior on
the monopole. However, we must choose a prior on the dipole components. Here, it is
more convenient to work in spherical coordinates, where the dipole is naturally expressed.
We assume a separable prior, so we can write

\[
\Pr(A, \phi, \theta) = \Pr(A)\,\Pr(\phi, \theta). \tag{4.42}
\]

The obvious choice for Pr(φ, θ) is one which gives no preference to any particular angle, so
that there is no preference specified for the dipole direction. The necessary prior can be
derived from the symmetry argument that the probability of a point being in a particular
region is proportional to the region's angular area. That is,

\[
\mathrm{d}\Pr = \frac{\mathrm{d}\Omega}{4\pi} = \frac{\sin\theta}{4\pi}\,\mathrm{d}\theta\,\mathrm{d}\phi, \tag{4.43}
\]

where the factor of 4π is chosen to give the correct normalisation. Thus, the prior is simply

\[
\Pr(\phi, \theta) = \frac{\sin\theta}{4\pi}. \tag{4.44}
\]

We have to choose a realistic prior for Pr(A); an unreasonably broad choice of prior will
cause a model with more parameters to always be disfavoured (Sivia & Skilling, 2006).
An ideal choice would be Pr(A) ∝ 1/A (the Jeffreys' prior), however this prior cannot be
normalised. A heuristic choice is

\[
\Pr(A) = C e^{-A/k} \tag{4.45}
\]

for some initial scale estimate k (Sivia & Skilling, 2006). This gives a preference to small
amplitudes, which is what we naturally expect. Thus, the prior required is

\[
\Pr(A, \phi, \theta) = \frac{C}{4\pi} e^{-A/k} \sin\theta. \tag{4.46}
\]

To ensure that Pr(A, φ, θ) is properly normalised, we need

\[
\int_{A=0}^{\infty} \int_{\phi=0}^{2\pi} \int_{\theta=0}^{\pi} \frac{C}{4\pi} e^{-A/k} A^2 \sin\theta \,\mathrm{d}\theta\,\mathrm{d}\phi\,\mathrm{d}A = 1, \tag{4.47}
\]

which means that

\[
C = \frac{1}{2k^3}. \tag{4.48}
\]


If we define

\[
F(A) = \int_{A'=0}^{A} \int_{\phi=0}^{2\pi} \int_{\theta=0}^{\pi} \frac{1}{8\pi k^3} e^{-A'/k} A'^2 \sin\theta \,\mathrm{d}\theta\,\mathrm{d}\phi\,\mathrm{d}A', \tag{4.49}
\]

then note that F(k) ≈ 0.08, F(2k) ≈ 0.32 and F(3k) ≈ 0.58.

Thus, most of the probability volume is located at A > k. We think that a choice of
k = 1 × 10⁻⁵ for the dipole amplitude as a prior is not too controversial. This means
that there is a 99.7 percent chance that the dipole amplitude is less than 10⁻⁴, with
other probabilities as given above. In our case, A = 0.97 × 10⁻⁵ and θ = 151°. Thus,
Pr(x_d) = exp(-0.97) sin(151°)/[8π(10⁻⁵)³], again neglecting the monopole prior because
it is common to the evidence for the monopole-only model.

With these assumptions, we calculate B = 49.7. That is, the dipole + monopole model is
preferred to the monopole model at the 98 percent level. Converting this to the Jeffreys'
scale (Jeffreys, 1961) requires us to consider 2 ln B = 7.8, which is considered strong
evidence in favour of the dipole + monopole model over the monopole-only model.
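
The numbers quoted for the amplitude prior and the Bayes factor can be checked directly. Carrying out the angular integrals in equation 4.49 leaves F(A) = 1 - e^(-x)(1 + x + x²/2) with x = A/k, which is the form used below. Reading the "98 percent level" as B/(1 + B) assumes equal prior odds for the two models; that interpretation is ours, and the function name is illustrative.

```python
import math

def prior_cdf(x):
    """F(A) from equation 4.49 as a function of x = A/k."""
    return 1.0 - math.exp(-x) * (1.0 + x + 0.5 * x * x)

for multiple in (1, 2, 3, 10):
    print(f"F({multiple}k) = {prior_cdf(multiple):.3f}")

bayes_factor = 49.7  # value quoted in the text
posterior_probability = bayes_factor / (1.0 + bayes_factor)  # assumes equal prior odds
two_ln_b = 2.0 * math.log(bayes_factor)
print(f"P(dipole + monopole) = {posterior_probability:.2f}, 2 ln B = {two_ln_b:.1f}")
```

With k = 1 × 10⁻⁵, the x = 10 case corresponds to A = 10⁻⁴ and reproduces the 99.7 percent figure.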

4-6.4 Potential effect of differences in atomic data and q coefficients 

If the atomic data or q coefficients we used were significantly different to those used by
Murphy et al. (2004), this could spuriously create differences in Δα/α between VLT and
Keck. This has the potential to mimic spatial variation in α. To check the influence
of this, we re-fit the VLT spectra using the same atomic data used by Murphy et al.,
and then combine the Δα/α values with the Keck values. Where we use transitions that
were not available to Murphy et al. (e.g. Mn II and Ti II) we make no modification
to the atomic data or q coefficients. The frequency of occurrence of these transitions
in the sample is small and therefore this is of little consequence. When we proceed in
this way, the parameters for the model Δα/α = A cos Θ + m are: A = 0.97 × 10⁻⁵,
RA = (17.5 ± 1.0) hr, dec. = (-60 ± 10)° and m = (-0.168 ± 0.084) × 10⁻⁵. The
significance of the dipole + monopole model with respect to the monopole-only model is
4.15σ. We conclude that the impact of any variations between atomic data or the q
coefficients used for our fits and those used by Murphy et al. (2004) is negligible.

4-6.5 Alignment by chance between Keck and VLT 

One can pose the question: "Given the distribution of sightlines and values of Δα/α in
each sample, what is the probability of observing, by chance, alignment as good as or
better than that observed between the Keck and VLT samples?" To assess this, we undertake
a bootstrap analysis, where at each bootstrap iteration we randomly reassign the values
of Δα/α in both the Keck and VLT samples to different sightlines within those samples,
keeping the redshifts of the absorbers fixed. That is, we do not mix the two samples.
We then calculate the best-fitting dipole vectors for each sample, and calculate the angle
between them. Over many iterations, we then determine the percentage of cases in which
the fitted angle is smaller than the angle for our actual data.

For our actual Keck and VLT samples, the angle between the fitted dipole vectors is 24
degrees, and the chance probability is 6 percent. We show the results of this bootstrap
analysis in figure 4.16. Thus, it seems unlikely that inter-telescope systematics are
responsible for the observed effect. The good consistency between the results also
qualitatively supports the notion that the measured effect is real.
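
Two pieces of this procedure are simple enough to sketch: the angle between two dipole directions given in equatorial coordinates, and the chance probability as the fraction of randomised trials that align at least as well as the data. The function names are ours, the dipole directions are the rounded best-fit values quoted earlier in this chapter, and the list of simulated angles is a placeholder; the dipole fitting and the reassignment of Δα/α values to sightlines are not shown.

```python
import math

def unit_vector(ra_hours, dec_deg):
    """Cartesian unit vector for a direction given as RA (hours) and dec. (degrees)."""
    ra = math.radians(ra_hours * 15.0)
    dec = math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def angular_separation_deg(ra1_hours, dec1_deg, ra2_hours, dec2_deg):
    """Angle on the sky, in degrees, between two directions."""
    u = unit_vector(ra1_hours, dec1_deg)
    v = unit_vector(ra2_hours, dec2_deg)
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

def chance_probability(simulated_angles_deg, observed_angle_deg):
    """Fraction of randomised trials with alignment at least as good as observed."""
    hits = sum(1 for angle in simulated_angles_deg if angle <= observed_angle_deg)
    return hits / len(simulated_angles_deg)

# Rounded dipole + monopole directions quoted in the text: VLT, then Keck
vlt_keck_angle = angular_separation_deg(18.3, -62.0, 16.0, -47.0)
print(f"VLT-Keck dipole separation: {vlt_keck_angle:.1f} degrees")

# Placeholder angles standing in for the output of many randomised trials
toy_trials = [10.0, 30.0, 50.0, 70.0]
print(chance_probability(toy_trials, vlt_keck_angle))
```

Because the inputs are the rounded published directions, the separation agrees with the quoted 24 degrees only to within about a degree.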



[Figure 4.16 plot. Horizontal axis: angle between dipole vectors (degrees).]

Figure 4.16: Results of the bootstrap analysis described in section 4-6.5 to assess the 
probability of obtaining alignment of the dipole vectors between the VLT and Keck samples 
as good as we have observed by chance. The vertical red line shows the angle between the 
Keck and VLT dipole vectors (24 degrees). The area to the left of the red line indicates 
the probability of interest, namely 6 percent. 

4-6.6 Low-z vs high-z sample cuts

We divide our sample into low-z and high-z absorbers to examine the contribution of
the different redshifts to the dipole detection. Although there is no clear delineation
between which transitions are fitted for a given redshift, we can generally say that the
low-z sample is dominated by the Mg/Fe combination, that intermediate redshifts display
a wide range of transitions, and that high redshift systems are dominated by the Si II/Al
II/Fe II λ1608 combination with Cr II/Zn II/Ni II. In particular, Mg II, Mg I λ2852 and
the Fe II transitions with λ₀ > 2200 Å are not generally used when fitting absorbers at high
z because they are either beyond the red cut-off in the observed spectral range, or the
transitions are affected by sky absorption or emission.

If the observed dipole effect was caused by chance or by a systematic effect which affects 
some combination of transitions, then we would not expect dipole fits to absorbers from 
high and low redshift to yield the same location on the sky. Conversely, if dipole models 
fitted to high and low redshift samples point in a similar direction, this lends support to 
the dipole interpretation of the data. 

We cut the data into a z < 1.6 sample (low-z) and a z > 1.6 sample (high-z). This divides
the data approximately in half, with 148 points in the low-z sample and 145 in the high-z
sample. We show in figure 4.17 the confidence limits on the dipole directions from separate
fits to the low-z and high-z samples, and demonstrate that they yield consistent estimates
of the dipole location. We give the parameters to the model Δα/α = A cos(Θ) + m in
table 4.5. In particular, the dipole vectors are separated by 13 degrees on the sky.
Given the distribution of Δα/α values and sightlines in each sample, the probability of
obtaining alignment this good or better by chance is 2 percent. Given that the transitions
used at low and high redshift are significantly different (and the relationship between the
q coefficients and wavelength is significantly different for the transitions used at low and
high redshift), this consistency further supports the dipole interpretation of the data. It
is also clear that the dipole signal is significantly larger at high redshift, although the low
redshift sample contributes.

There is no significant evidence for a high-z monopole, but the low-z monopole is significant
at the 3.6σ level. We discuss the significance of the low-z monopole in section 4-6.8.

Table 4.5: Parameters for the model Δα/α = A cos(Θ) + m for z < 1.6 and z > 1.6
samples. The column "δA" gives 1σ confidence limits on A. The column labelled "sig"
gives the significance of the dipole model with respect to the monopole model. Although
it is clear that most of the significance comes from the z > 1.6 sample, the z < 1.6 sample
also contributes. Additionally, a dipole model for the z < 1.6 sample points in a similar
direction to that of the z > 1.6 sample.

Sample    A (10⁻⁵)   δA (10⁻⁵)      RA (hr)        dec. (°)     m (10⁻⁵)            sig
z < 1.6   0.56       [0.38, 0.85]   (18.1 ± 1.8)   (-57 ± 22)   (-0.390 ± 0.108)    1.4σ
z > 1.6   1.38       [1.12, 1.74]   (16.5 ± 1.4)   (-63 ± 11)   (0.097 ± 0.138)     3.5σ




[Figure 4.17 sky map. Horizontal axis: Right Ascension (hours).]

Figure 4.17: Sky map in equatorial (J2000) coordinates showing the 68.3 percent
(1σ equivalent) confidence limits of the location of the pole of the dipole fitted to the
z < 1.6 combined sample (green region), z > 1.6 combined sample (blue region) and
combined sample (red region) under the model Δα/α = A cos(Θ) + m. The location of
the CMB dipole and antipole are marked as P_CMB and A_CMB respectively for comparison
(Lineweaver, 1997). This figure demonstrates that the low-z and high-z absorbers produce
consistent estimates of the dipole location, despite generally using significantly different
combinations of transitions. The dipole vectors for the z < 1.6 and z > 1.6 sample are
separated by 13 degrees. The probability of getting alignment this good or better by
chance is 2 percent.

4-6.7 Joint probability

The probability of obtaining, by chance, alignment between the dipole vectors from dipole
models fitted to the Keck and VLT samples separately that is as good as or better than is
seen is about 6 percent. The chance probability of obtaining alignment between the dipole vectors
from dipole models fitted to the low- and high-redshift samples is about 2 percent. Through
a bootstrap method we have calculated the joint probability of obtaining alignment that
is at least as good as seen for both of these conditions by chance, and it is ~ 0.1 percent.

It is possible to conjecture that the Keck results are somehow erroneous, with Δα/α
values shifted to be more negative on average through some unknown systematic. The
VLT results then show no overall statistically significant monopole variation, and only
a marginal (~ 2.2σ) angular variation. In this case, it would then appear that there is
no statistically significant variation of α. However, in this case one is still left with the
~ 0.1 percent chance probability above, which is equivalent to ~ 3.3σ. This would be
a large and intriguing coincidence, but we agree that 3.3σ is not overwhelmingly large.
Ultimately, we cannot exclude the possibility that the results presented here which seem
to indicate spatial variation of α are due to chance (with or without the influence of an
unknown systematic) but the joint chance probability of 0.1 percent described here seems
to suggest that this is unlikely. We discuss potential systematic errors in chapter 5.
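
The correspondence between the 0.1 percent joint probability and ~ 3.3σ follows from a two-sided Gaussian conversion, sketched below. The product of the two individual chance probabilities is included only as a plausibility check under an independence assumption of ours; the joint value in the text comes from the bootstrap, not from this product.

```python
from statistics import NormalDist

def p_to_sigma(p):
    """Two-sided Gaussian-equivalent significance of a chance probability p."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)

p_keck_vlt = 0.06      # Keck vs VLT alignment, from the text
p_low_high_z = 0.02    # low-z vs high-z alignment, from the text
p_joint = 0.001        # bootstrap joint probability, from the text

# Assumes the two alignment tests are independent (not claimed in the text)
p_if_independent = p_keck_vlt * p_low_high_z

print(f"product of individual probabilities: {p_if_independent:.4f}")
print(f"joint probability {p_joint} corresponds to {p_to_sigma(p_joint):.1f} sigma")
```

The product, 0.12 percent, is close to the bootstrap joint probability, which suggests that the two alignment conditions are not strongly correlated.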

4-6.8 Significance of the monopole 

In section 4-6.6 we noted that the low-z sample shows evidence for a statistically significant
monopole at the 3.6σ level. In figure 4.11, the existence of the monopole in both samples
can be seen at low z. Note in particular the top panel, where the trend of Δα/α is toward
negative Δα/α for z < 1.6.

An obvious question is whether the monopole arises from one of the Keck or VLT samples.
For a model Δα/α = A cos(Θ) + m, the Keck sample yields a z < 1.6 monopole of m =
(-0.404 ± 0.171) × 10⁻⁵, which differs from zero at the 2.4σ level. However, the same model
fitted to the VLT z < 1.6 Δα/α values yields m = (-0.373 ± 0.295) × 10⁻⁵. This differs
from zero at the 1.3σ level. There are three important considerations from these values:
i) Both data sets yield very consistent monopole values for Δα/α at low redshift; the
monopole values differ at the 0.09σ level. Therefore, whatever is generating the monopole
appears to affect both the Keck and VLT samples. ii) Because there is no significant
difference between the monopole values in the Keck and VLT samples, the monopole
cannot be responsible for mimicking angular variation in α. iii) Additionally, most of the
dipole signal originates at z > 1.6 (where the significance of the dipole + monopole model
over the monopole-only model is 3.5σ). As such, the presence of a low-z monopole does
not affect the redshifts where most of the dipole significance originates.

There are several possible explanations for the low-z monopole, and we discuss each of them in turn:

1. Errors in the laboratory wavelengths. Errors in the laboratory wavelengths of
transitions which feature predominantly at low redshifts could cause a statistically
significant monopole at low redshifts. However, this seems particularly unlikely. The
Mg I/II wavelengths have been accurately measured on an absolute scale generated
using a frequency-comb calibration system. The Fe II wavelengths used at z < 1.6
have also been precisely measured (the λλ1608, 1611 transitions are more difficult
to measure accurately, but these transitions are used infrequently at low redshifts
due to their short rest wavelengths). For instance, the absolute velocity uncertainty
in the Fe II λ2382 transition is ~14 m s⁻¹, which is significantly smaller than the
~82 m s⁻¹ which would be needed to generate a monopole value of -0.39 × 10⁻⁵.
This implies a systematic error some six times larger than the existing error budget,
which seems unlikely. Additionally, the relative wavelength scales of the different
experiments which measured the transitions used at lower redshifts are likely to be
significantly better than this.

2. Time evolution of α. The functional form for variation of α (if α varies) is unknown.
Recent observations confirm the apparent acceleration of the universe at late times
(Astier et al., 2006), for which dark energy is posited as an explanation. For z < 0.5,
dark energy dominates over matter and radiation (Riess et al., 2004). If α couples to
dark energy, then late-time evolution of α might be possible. Monotonic evolution
of α cannot by itself be an explanation for a low-z monopole, because this would
imply that Δα/α should approach zero for z → 0, with the greatest divergence of
Δα/α from zero at high redshift. If α oscillates with time then a pattern such as is
seen could arise. However, this would require the period of oscillations to be ~ twice
the age of the universe, with the present day at a node of the oscillation, in order
to obtain (Δα/α)_{z>1.6} ≈ 0, (Δα/α)_{0.2<z<1.6} ≈ -0.4 × 10⁻⁵ and (Δα/α)_{z=0} = 0. This
may be possible, but this case seems rather contrived.

3. Dependence of a on the local environment. If the value of a depends on the local 
environment (e.g. matter density, gravitational potential, or gradient of the gravita- 
tional potential) then this could produce an offset between the value of a measured 
in the quasar absorbers and the value measured in the laboratory, even as z — )■ 0. If 
this was the case, we would expect a similar magnitude monopole to also be present 
at high redshift, which is not seen. 

4. Telescope systematics. Wavelength-dependent telescope systematics seem difficult 
to support given the inter-telescope consistency. 

5. Significantly different abundances of isotopes in the absorbers. The isotopic splitting 
scales as Δω_i ∝ ω_0/m_i², where m_i is the mass of the species under consideration. 
Mg is the lightest atom used in the MM method, and therefore the isotopic splitting 
for the Mg transitions is relatively large. If the abundance of the three Mg isotopes 
in the quasar absorbers differs significantly from terrestrial values, this would mimic 
a change in α. The low-z sample is dominated by the Mg II/Fe II combination, 
which is particularly sensitive to the effect of differences in the abundance of the Mg 
isotopes (Murphy et al., 2001a). 
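
To make the mass scaling in item 5 concrete, the following minimal sketch compares the 
relative size of the isotopic splitting for Mg and Fe at equal transition frequency. It 
uses only the proportionality quoted above; the function name and the use of standard 
atomic weights are illustrative assumptions, and the true constant of proportionality 
differs from transition to transition. 

```python
# Standard atomic weights in atomic mass units (assumed inputs for illustration).
ATOMIC_MASS = {"Mg": 24.305, "Fe": 55.845}


def relative_isotope_splitting(mass_u, omega0=1.0):
    """Mass scaling only: splitting proportional to omega0 / m**2 (arbitrary units)."""
    return omega0 / mass_u ** 2


mg = relative_isotope_splitting(ATOMIC_MASS["Mg"])
fe = relative_isotope_splitting(ATOMIC_MASS["Fe"])
ratio = mg / fe  # how much larger the Mg splitting is at equal omega0
print(f"Mg/Fe splitting ratio from the 1/m^2 scaling: {ratio:.1f}")
```

On this scaling alone the Mg splitting is roughly five times that of Fe, which is why a 
Mg II/Fe II comparison is sensitive to the assumed Mg isotopic abundances. 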

It is possible that a combination of the time evolution of α and dependence of α on the local 
environment could explain the low-z monopole, but this requires two different mechanisms. 
Additionally, in this circumstance the magnitude of the environmental dependence must 
be very similar to the magnitude of the time evolution from z ~ 4 to z = 0 in order to 
obtain the observed distribution of Δα/α with z, requiring significant fine-tuning. 

On balance, evolution in the abundance of the Mg isotopes seems like the most likely of 
these explanations. We explore the effect of differences in the relative abundances for Mg 
isotopes between terrestrial values and those in the quasar absorbers in section 5-6. 

The lack of a clear explanation for the low-z monopole is a weakness of the results presented 
here. Specifically targeted future observations at sufficiently high resolving powers and 
signal-to-noise ratios may be able to resolve the isotopic shifts for the magnesium lines (or 
otherwise), thus directly determining whether the above explanation is correct. It would 
be particularly interesting to map out the angle-independent variation in α as a function of 
redshift; this would require many Δα/α measurements at all angles, binned into redshift 
slices. Similarly, it would be useful to demonstrate whether or not a Δα/α monopole was 
present at low redshifts by using transitions other than magnesium. Discovery of a low-z 
monopole in this case would suggest evolution in α (or perhaps some other systematic), 
whilst failure to detect the monopole would imply that evolution in the abundance of the 
magnesium isotopes was responsible. 

4-6.9 Iterative clipping of potentially outlying Δα/α values 

We have attempted to be conservative in presenting our results when accounting for extra 
scatter in the Δα/α values about a model by adding a term, σ_rand, in quadrature with the 
error bars. This effectively functions as an interpolation between a fit where the error 
bars are believed to be correct and an unweighted fit, where the error bars are unknown. 

However, another option is to assume that the statistical error bars for most Δα/α values 
are a good representation of the total uncertainty for those absorbers, and then remove 
points one-by-one ("clipping") until χ²_ν ≈ 1. For our sample it is difficult to determine 
to what extent different random processes affect different absorbers, and therefore to 
determine to what extent clipping is justified. Adding some σ_rand in quadrature with all 
Δα/α values, as we have done, is a conservative option. Nevertheless, we explore the effect 
of data clipping here to investigate the robustness of our results to the removal of Δα/α 
values. 

Traditionally, data clipping involves iteratively removing the point with the largest residual 
and then re-fitting. However, for the reasons given in section 4-4.8.4, this has the potential 
to incorrectly remove points. Therefore, we use a modified method. At each iteration, 
we calculate the LTS fit using the model Δα/α = A cos(Θ) + m to the Δα/α values with 
their raw statistical errors, and then remove the point with the largest residual. However, 
we choose k = n − 1 in this case. Effectively, at each stage, we want to identify only 
one point to remove, and therefore it makes sense to calculate a fit to n − 1 points. 
We then calculate a weighted fit, using only the statistical error bars, and calculate the 
significance of the dipole model. For efficiency of calculation, we avoid bootstrapping, and 
so we use the method of Cooke & Lynden-Bell (2010) to calculate the significance of the 
dipole+monopole fit with respect to the monopole fit. We then repeat the process. At 
any iteration, if χ²_ν > 1 we multiply all entries of the covariance matrix by χ²_ν in order 
to account for excess scatter about the model. If χ²_ν < 1 we do not adjust the covariance 
matrix. 
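
The clipping loop described above can be sketched as follows. This is a simplified 
illustration rather than the analysis code: it assumes the dipole direction is held fixed, 
so that cos(Θ) is known for every absorber and the model is linear in A and m (the real 
fit also varies the direction), and it records |A|/σ_A as a crude stand-in for the 
Cooke & Lynden-Bell (2010) significance. All function names are hypothetical. With 
k = n − 1 the LTS fit is exactly the best of the n leave-one-out fits. 

```python
import math


def weighted_line_fit(x, y, sigma):
    """Weighted least squares for y = A*x + m. Returns (A, m, chi2, var_A)."""
    w = [1.0 / s ** 2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * sxx - sx ** 2
    A = (s0 * sxy - sx * sy) / det
    m = (sxx * sy - sx * sxy) / det
    chi2 = sum(wi * (yi - A * xi - m) ** 2 for wi, xi, yi in zip(w, x, y))
    return A, m, chi2, s0 / det


def lts_fit(x, y, sigma):
    """LTS fit with k = n - 1: the best of the n leave-one-out weighted fits."""
    n = len(x)
    fits = []
    for j in range(n):
        keep = [i for i in range(n) if i != j]
        fits.append(weighted_line_fit([x[i] for i in keep],
                                      [y[i] for i in keep],
                                      [sigma[i] for i in keep]))
    A, m, _, _ = min(fits, key=lambda f: f[2])
    return A, m


def iterative_clip(cos_theta, values, errors, min_points=4):
    """Clip one point per iteration until the reduced chi-squared is <= 1."""
    x, y, sigma = list(cos_theta), list(values), list(errors)
    history = []
    while True:
        A, m, chi2, var_A = weighted_line_fit(x, y, sigma)
        chi2_nu = chi2 / (len(x) - 2)  # two free parameters in this sketch
        # Inflate the variance by chi2_nu only when chi2_nu > 1.
        sigma_A = math.sqrt(var_A * max(1.0, chi2_nu))
        history.append({"n": len(x), "A": A, "m": m,
                        "chi2_nu": chi2_nu, "snr": abs(A) / sigma_A})
        if chi2_nu <= 1.0 or len(x) <= min_points:
            return history
        A_lts, m_lts = lts_fit(x, y, sigma)
        resid = [abs(yi - A_lts * xi - m_lts) / si
                 for xi, yi, si in zip(x, y, sigma)]
        worst = resid.index(max(resid))  # largest residual about the LTS fit
        for seq in (x, y, sigma):
            del seq[worst]


# Synthetic demonstration: an exact dipole plus one gross outlier.
cos_theta = [-0.9, -0.6, -0.3, 0.0, 0.2, 0.5, 0.7, 0.9]
values = [1.0 * c + 0.1 for c in cos_theta]
values[3] += 5.0
errors = [0.5] * len(values)
history = iterative_clip(cos_theta, values, errors)
print(history[-1]["n"], round(history[-1]["A"], 6))
```

In this toy case a single iteration removes the outlier, after which χ²_ν falls below 
unity and the loop stops. 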

Initially, one expects the significance of the fit to improve, as one discards a few points 
which are not consistent with the general trend of the fit. Eventually, one will remove 
enough Aa/a values that the significance must decline. If the significance declines rapidly, 
this implies that the dipole effect is dominated by a few points. Conversely, if the signifi- 
cance of the fit is sustained or improved for the removal of small fractions of the data (e.g. 
~ 10 percent), this qualitatively implies robustness of the result. 

We show in figure 4.18 the results of this process. We find that we must remove large 
numbers of absorbers to destroy the significance of the dipole. In particular, the significance 
does not decrease rapidly with the number of Δα/α values clipped initially, suggesting 
that the observed dipole effect is not being caused by a few outlying points. If we clip 
until χ²_ν = 1, the significance of the dipole is almost 7σ. 

In figure 4.19 we show the effect of clipping Δα/α values on the location of the dipole. 
One expects that if the dipole effect is real, then the position of the dipole should not 
change dramatically with the removal of small amounts of data (that is, ΔΘ = Θ_i − Θ_0 
should be small). To assess how likely it is that this seemingly restricted path is typical 
for our distribution of data, we apply a bootstrap method to generate and iteratively trim 
300 new samples, and examine the distribution of ΔΘ at each point. We cannot use a 
traditional bootstrap, which resamples the data with replacement, because how the data 
is trimmed depends crucially on the distribution of residuals. Therefore, we resample the 
residuals of the fit to generate new samples. To do this, we use the following process 
to generate one sample: i) calculate the model prediction for each absorber given the 
model, p_i = A cos(Θ_i) + m; ii) calculate the residuals about the fit for each absorber, 
r_i = (Δα/α_i − p_i)/σ_i; iii) randomly reassign the calculated r_i to different absorbers, 
generating r'_i; iv) generate a new set of Δα/α values as Δα/α'_i = p_i + r'_i σ_i. In this 
way, we generate new values of Δα/α which represent different possible realisations of 
our sample where the actual distribution of residuals is preserved. This is demonstrated 
in figure 4.19. We see that the bootstrapped samples do not wander very far even when 
much of the data is removed (ΔΘ < 20°). 

To contrast this with the effect on a random sample, we also show in figure 4.19 the effect 
of trimming random samples. To do this, we generate 300 new samples by randomly 
reassigning values of Δα/α to different sightlines, and iteratively trimming under the 
model Δα/α = A cos(Θ) + m. We see that the dipole direction for our actual sample is far 
more stable under trimming than is typical for the random samples, which suggests that 
the actual sample is significantly dissimilar to random samples. 
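
The two kinds of synthetic sample described above can be sketched as follows. This is an 
illustrative outline only: the function names are hypothetical, the fitted A, m and the 
cos(Θ_i) values are taken as given, and in the randomised case we assume that each 
statistical error travels with its Δα/α value, which the text does not specify. 

```python
import random


def resample_residuals(cos_theta, values, errors, A, m, rng):
    """One bootstrap sample via steps i)-iv): permute the normalised residuals."""
    p = [A * c + m for c in cos_theta]                           # i) model prediction
    r = [(v - pi) / s for v, pi, s in zip(values, p, errors)]    # ii) residuals
    r_new = r[:]
    rng.shuffle(r_new)                                           # iii) reassign
    return [pi + ri * s for pi, ri, s in zip(p, r_new, errors)]  # iv) new values


def randomise_over_sightlines(values, errors, rng):
    """One random sample: shuffle (value, error) pairs across the sightlines."""
    pairs = list(zip(values, errors))
    rng.shuffle(pairs)
    return [v for v, _ in pairs], [s for _, s in pairs]


rng = random.Random(1)
cos_theta = [-0.8, -0.2, 0.4, 0.9]
values = [0.3, -0.4, 1.1, 0.7]
errors = [0.5, 1.0, 0.8, 0.6]
bootstrap_values = resample_residuals(cos_theta, values, errors, 1.0, -0.2, rng)
random_values, random_errors = randomise_over_sightlines(values, errors, rng)
```

Each synthetic sample would then be passed through the same trimming procedure as the 
real data, and ΔΘ recorded at each fraction of data removed. 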




[Figure 4.18: plot not reproduced. Horizontal axis: fraction of data trimmed (%).] 

Figure 4.18: Effect of iteratively clipping the data on the statistical significance of the 
dipole model for the combined sample, as described in section 4-6.9. The vertical axis 
shows the statistical significance of the dipole as determined by the method of Cooke & 
Lynden-Bell (2010) given in terms of σ (solid line) and χ²_ν at that point (blue, dotted 
line). A dashed horizontal line is drawn at 3σ for reference. The vertical red (dashed) line 
indicates the point at which our clipping method reduces χ²_ν to below unity. We note that 
we have to remove more than 40 percent of data before the significance of the detection 
drops to about 3σ. As there is no good reason to remove so much data, this implies that 
our result is robust. The actual significance given here is probably overstated compared 
to the "true" significance, given that no attempt has been made to account for systematic 
errors. 

4-6.10 Removal of spectra 

A further question one might ask is how sensitive our results are to the inclusion of 
particular spectra. We would like to know whether the dipole result could be dominated 
by a small number of spectra which, if removed, would destroy the result. 

Figure 4.19: Effect of iteratively clipping the Δα/α values on the location of the dipole, 
as described in section 4-6.9. The vertical axis shows the deviation of the fitted angle from 
the untrimmed model (ΔΘ) as a function of the percentage of absorbers removed. This 
figure compares the results of trimming on our actual Δα/α values, bootstrapped samples 
designed to emulate our data, and random samples. Actual data: The solid black line 
shows the results of trimming for our Δα/α values. If the fit is stable and not due to the 
presence of a small number of highly significant points, we expect to see that ΔΘ should 
not grow rapidly with the amount of data removed. This is what is seen. Bootstrapped 
samples: The dashed lines show the 1σ range for 300 bootstrap samples (generated as 
described in the text). This shows the typical range of variation at each fraction of absorbers 
removed, given the distribution of sightlines, values of Δα/α, statistical errors and distribution 
of residuals in the sample. The region given reflects the 1σ range for the bootstrapped 
samples at each point; each individual sample may wander substantially more than is 
indicated by this range, and so the deviation of the path for our actual sample outside 
the region is not indicative of any problem with our Δα/α values. Random samples: The 
blue, dotted lines show the 1σ range for 300 samples where we have randomised Δα/α 
over the sightlines. We see that ΔΘ in this case grows rapidly with increased trimming 
for these samples. Our real sample does not do this, which suggests that our real sample 
is significantly dissimilar from a random sample. 



We therefore explore this question through a jack-knife method, where we remove one 
quasar at a time and recalculate the statistical significance of the fit. We show the results 
of this exploration in figure 4.20. The figure clearly demonstrates that, unsurprisingly, our 
result is not due to a single quasar spectrum. We extend this in figure 4.21 to show the 
effect of removing 5 spectra at random. We chose the number 5 in order to potentially 
include the cluster of 5 quasars at RA 22 hr, dec ~ −45°, where all of these sightlines 
demonstrate Δα/α > 0. Under this circumstance, the probability of obtaining a dipole 
result which is insignificant (< 3σ) is small. This suggests that the dipole effect is not 
being created by a small number of spectra. 
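
The two removal tests can be outlined as below. This is a sketch under stated 
assumptions: `statistic` is a hypothetical user-supplied callable that takes the indices 
of the retained absorbers and returns the dipole significance (the real calculation is 
not reproduced here), and each absorber is labelled by the quasar spectrum it came from. 

```python
import random


def jackknife_by_quasar(quasar_ids, statistic):
    """Recompute `statistic` with each quasar's absorbers removed in turn."""
    results = {}
    for q in sorted(set(quasar_ids)):
        keep = [i for i, qi in enumerate(quasar_ids) if qi != q]
        results[q] = statistic(keep)
    return results


def random_removal(quasar_ids, statistic, n_remove=5, n_trials=1000, seed=0):
    """Recompute `statistic` after dropping n_remove randomly chosen quasars."""
    rng = random.Random(seed)
    unique = sorted(set(quasar_ids))
    out = []
    for _ in range(n_trials):
        dropped = set(rng.sample(unique, n_remove))
        keep = [i for i, qi in enumerate(quasar_ids) if qi not in dropped]
        out.append(statistic(keep))
    return out


# Toy demonstration: the "statistic" is just the number of absorbers retained.
quasar_ids = ["Q1", "Q1", "Q2", "Q3", "Q3", "Q3"]
jack = jackknife_by_quasar(quasar_ids, len)
trials = random_removal(quasar_ids, len, n_remove=2, n_trials=10)
print(jack)
```

The resulting distribution of significances can then be summarised, for example as the 
fraction of trials that fall below 3σ. 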




[Figure 4.20: plot not reproduced. Horizontal axis: statistical significance (σ).] 

Figure 4.20: Effect of removing quasar spectra on the statistical significance of the dipole, 
as assessed through a jack-knife method. Each spectrum is removed one at a time, and the 
value of the statistical significance of the model Δα/α = A cos(Θ) + m is calculated with 
respect to the monopole model, using the bootstrap method. We use a Gaussian kernel 
density estimator to construct the approximate probability density function of the effect 
of quasar spectrum removal, where the width of the Gaussian basis functions has been 
chosen to be the inverse of the number of spectra. The cumulative distribution function 
is plotted as a dashed, red line. This demonstrates that the angular dipole effect is not 
due to a single spectrum. 
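
A Gaussian kernel density estimate of the kind used for this figure can be sketched as 
follows. We assume here that the "width" of each basis function is its standard 
deviation, which the caption does not state explicitly; the function name and the sample 
values are illustrative only. 

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return pdf(x): a sum of equal-weight Gaussians centred on the samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return pdf


# Toy jack-knife significances (in units of sigma); bandwidth = 1 / N.
significances = [3.9, 4.0, 4.1, 4.2]
pdf = gaussian_kde(significances, bandwidth=1.0 / len(significances))

# Crude check that the estimate integrates to one over a generous range.
step = 0.001
area = sum(pdf(2.0 + i * step) for i in range(4001)) * step
print(f"integral of KDE over [2, 6]: {area:.4f}")
```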



4-6.11 Comment on the removal of outliers in the Keck and VLT samples 

In each of the VLT and Keck samples we have removed one putative outlier, which in each 
sample represents less than one percent of the Δα/α values. It is possible to calculate 
dipole significances and parameter values with these points included, but it is not clear 
what interpretation to place on these numbers on account of the arguments in section 
4-4.8.4. In particular, such a fit is immediately called into question on the basis of the fact 
that it contains outliers. Nevertheless, we tried such a fit and the dipole significance is 
not substantially altered. 

Figure 4.21: Effect of removing quasar spectra on the statistical significance of the 
dipole, as assessed through a sampling method. For 100,000 samples, 5 quasar spectra are 
randomly removed from the combined Keck + VLT sample, and the statistical significance 
of the dipole model Δα/α = A cos(Θ) + m is calculated through the method of Cooke 
& Lynden-Bell (2010). Gaussian basis functions with width 1/316 (≈ 10^(−5/2)) 
are used. 

This graph demonstrates that, in the absence of particular knowledge about problematic 
spectra, the chance of obtaining a dipole model where the statistical significance of the 
dipole is less than 3σ is small as a result of randomly removing 5 spectra. 

4-7 Translation from an angular variation model to a physical model including a 
distance measure 



We now explore simple phenomenological parameterisations of the dipole effect which 
attempt to account for distance dependence. In all of these models, the same Δα/α 
values identified as outliers previously have been removed from consideration. 



4-7.1 z^β dipole 

To model potential distance dependence directly with the observable quantity, z, we fit a 
power-law relationship of the form 

Δα/α = C z^β cos(Θ) + m (4.50) 

for some β and amplitude C. For a fit to the combined Keck + VLT samples this gives 
the "z^β dipole" sample. 

We use the Levenberg-Marquardt algorithm (Press et al., 1992) to fit equation 4.50 to the 
combined sample. This fit yields RA = (17.5 ± 1.0) hr, dec = (−62 ± 10)°, C = 0.81 (1σ 
confidence limits [0.55, 1.09] × 10⁻⁵), m = (−0.184 ± 0.085) × 10⁻⁵ and β = 0.46 ± 0.49. 
The fact that the amplitude grows as a low power of z, and the fact that β is statistically 
consistent with zero, is the reason that the approximation A ~ C z^β yields reasonable 
results earlier. We show the results of this fit in figure 4.22. This dipole+monopole model 
is statistically preferred over the monopole-only model at the 99.99 percent confidence level 
(3.9σ). The reduction in significance relative to the angular dipole model occurs as a result 
of the uncertainty in determining β, but is relatively small. 

Figure 4.22: Binned values of Δα/α plotted against z^β cos(Θ), for β = 0.41. The 
dipole+monopole model is preferred over the monopole-only model at the 3.9σ level. 
Importantly, this plot only covers |z^β cos(Θ)| < 1. Given that it is possible to probe up to 
redshift z < 4 with the MM method, judicious choice of observational targets close to the 
dipole axis might be able to extend the horizontal range of this graph up to ~ ±2, thereby 
potentially increasing sensitivity to the effect substantially, if the effect is real. 

Note that the standard practice of fitting for Δα/α as a function of redshift (Δα/α = 
az + m) is subsumed within this analysis, which directly determines the scaling relationship 
of Δα/α with redshift (and whether it is statistically compatible with linearity). 
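
As an illustration of how the power-law index in equation 4.50 can be constrained, the 
sketch below profiles over a grid of β values, solving for C and m by weighted linear 
least squares at each grid point. This is a deliberately simplified substitute for the 
Levenberg-Marquardt fit used in the text: it assumes the dipole direction is fixed, so 
that cos(Θ) is known for each absorber, and all names and the synthetic data are 
hypothetical. 

```python
def weighted_line_fit(x, y, sigma):
    """Weighted least squares for y = C*x + m. Returns (C, m, chi2)."""
    w = [1.0 / s ** 2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * sxx - sx ** 2
    C = (s0 * sxy - sx * sy) / det
    m = (sxx * sy - sx * sxy) / det
    chi2 = sum(wi * (yi - C * xi - m) ** 2 for wi, xi, yi in zip(w, x, y))
    return C, m, chi2


def fit_z_beta_dipole(z, cos_theta, values, errors, betas):
    """Grid search over beta; C and m are solved linearly at each beta."""
    best = None
    for beta in betas:
        x = [zi ** beta * ci for zi, ci in zip(z, cos_theta)]
        C, m, chi2 = weighted_line_fit(x, values, errors)
        if best is None or chi2 < best[3]:
            best = (beta, C, m, chi2)
    return best


# Noise-free synthetic data generated with beta = 0.5, C = 0.8, m = -0.2.
z = [0.3, 0.8, 1.2, 1.9, 2.5, 3.1]
cos_theta = [0.8, -0.5, 0.3, -0.9, 0.6, -0.2]
values = [0.8 * zi ** 0.5 * ci - 0.2 for zi, ci in zip(z, cos_theta)]
errors = [0.1] * len(z)
betas = [i * 0.05 for i in range(-10, 31)]
beta_fit, C_fit, m_fit, chi2_fit = fit_z_beta_dipole(z, cos_theta, values, errors, betas)
print(round(beta_fit, 2), round(C_fit, 3), round(m_fit, 3))
```

Setting cos(Θ) = 1 for every absorber and β = 1 reduces the model to the linear-in-redshift 
fit mentioned above. 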



4-7.2 r-dipole 



Another plausible alternative is to try to relate the amplitude of the dipole to some explicit 
distance metric. For simplicity, we use the "lookback-time distance". This is defined by 
r = ct, where c is the speed of light and t is the lookback time to the absorber. Thus, we 
try a fit of the form 

Δα/α = B r cos(Θ) + m. (4.51) 

To calculate lookback times, we use the standard ΛCDM (Λ Cold Dark Matter) model, 
with parameters given by the 5-year WMAP (Wilkinson Microwave Anisotropy Probe) 
results (Hinshaw et al., 2009). We note that this calculation is derived from the FLRW 
(Friedmann-Lemaître-Robertson-Walker) metric, which assumes isotropy of the universe. 
Our model implies anisotropy of the universe, and therefore use of the FLRW metric 
is strictly incorrect. Nevertheless, as Δα/α ≪ 1 we assume that the FLRW metric is 
a good approximation to the actual metric, and therefore that our lookback times are 
approximately correct. The ΛCDM parameters used are: (H_0, Ω_M, Ω_Λ) = (70.5, 0.2736, 
0.726). 
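
For concreteness, the lookback-time distance and the model of equation 4.51 can be 
evaluated with the short sketch below. It is a minimal illustration, not the code used 
for the analysis: radiation is neglected, the small remainder 1 − Ω_M − Ω_Λ is treated 
as curvature (an assumption on our part), the integral is done with Simpson's rule, and 
all function names are hypothetical. Since r = ct, a lookback time in Gyr is numerically 
equal to r in GLyr. 

```python
import math

H0 = 70.5                      # km/s/Mpc
OMEGA_M = 0.2736
OMEGA_L = 0.726
OMEGA_K = 1.0 - OMEGA_M - OMEGA_L   # tiny remainder, treated as curvature (assumption)
HUBBLE_TIME_GYR = 977.8 / H0   # 1/H0 in Gyr; 977.8 converts (km/s/Mpc)^-1 to Gyr


def lookback_time_gyr(z, n_steps=1000):
    """Lookback time: (1/H0) * integral_0^z dz' / ((1 + z') E(z'))."""
    if z == 0:
        return 0.0

    def integrand(zp):
        e = math.sqrt(OMEGA_M * (1 + zp) ** 3 + OMEGA_K * (1 + zp) ** 2 + OMEGA_L)
        return 1.0 / ((1 + zp) * e)

    h = z / n_steps  # n_steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(z)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return HUBBLE_TIME_GYR * total * h / 3.0


def cos_angle(ra1_hr, dec1_deg, ra2_hr, dec2_deg):
    """Cosine of the angle between two sky positions (spherical law of cosines)."""
    a1, a2 = math.radians(15.0 * ra1_hr), math.radians(15.0 * ra2_hr)
    d1, d2 = math.radians(dec1_deg), math.radians(dec2_deg)
    return (math.sin(d1) * math.sin(d2)
            + math.cos(d1) * math.cos(d2) * math.cos(a1 - a2))


def r_dipole_model(z, cos_theta, B, m):
    """Equation 4.51 with r in GLyr and B in GLyr^-1."""
    return B * lookback_time_gyr(z) * cos_theta + m


t1 = lookback_time_gyr(1.0)
print(f"lookback time to z = 1: {t1:.2f} Gyr")
```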

We show in figure 4.23 the fit of Δα/α to a combined VLT + Keck sample ("combined 
r-dipole sample") against r cos(Θ) = ct cos(Θ). The parameters for this fit are: B = 
1.1 × 10⁻⁶ GLyr⁻¹ (1σ confidence limits [0.9, 1.3] × 10⁻⁶ GLyr⁻¹), RA = (17.5 ± 1.0) hr, 
dec = (−62 ± 10)° and m = (−0.187 ± 0.084) × 10⁻⁵. Using the bootstrap method we 
assess the statistical significance of this fit with respect to the monopole-only fit as 4.15σ. 
In figure 4.24, we show the confidence regions on the dipole location for the VLT, Keck 
and combined samples on the sky. 

In galactic coordinates, the pole of this fit is at approximately (l, b) = (330°, −15°). The 
fact that the pole and antipole are close to the Galactic Plane explains the relative lack of 
absorbers near to the pole and antipole in both the Keck and VLT samples, a fact made 
obvious in figure 4.14 earlier. 

If we adopt a dipole-only model, 

Δα/α = B r cos(Θ), (4.52) 

we derive B = 1.1 × 10⁻⁶ GLyr⁻¹ (1σ confidence limits [0.9, 1.3] × 10⁻⁶), RA = (17.4 ± 
0.9) hr, dec = (−58 ± 9)°. The statistical significance of the dipole model is 99.998 percent 
(4.22σ). The confidence limits on the dipole location for this fit for the VLT, Keck and 
combined samples are shown in figure 4.24. 

Figure 4.23: Binned values of Δα/α plotted against r cos(Θ) = ct cos(Θ), where t is the 
look-back time to a redshift z. ΛCDM parameters are from Hinshaw et al. (2009). The top 
panel (triangles) shows the VLT + Keck sample, binned with approximately 25 absorbers 
per bin. The bottom panel shows VLT (circles) and Keck (squares) Δα/α values, binned 
with approximately 12 absorbers per bin. The red (solid) line in both cases shows the 
model, Δα/α = B r cos(Θ) + m. The parameters for the fit are: B = 1.1 × 10⁻⁶ GLyr⁻¹ 
(1σ confidence limits [0.9, 1.3] × 10⁻⁶ GLyr⁻¹), RA = (17.5 ± 1.0) hr, dec. = (−62 ± 10)° 
and m = (−0.187 ± 0.084) × 10⁻⁵. It is interesting that this simple model is a reasonable 
representation of the data. 

[Figure 4.24: plot not reproduced. Horizontal axis: right ascension (hours).] 

Figure 4.24: Sky map in equatorial coordinates showing the 68.3 percent (1σ equivalent) 
confidence limits of the location of the pole of the dipole for a fit to the Keck Δα/α values 
(green region), VLT Δα/α values (blue region) and combined Δα/α values (red region), 
for a fit of Δα/α = B r cos(Θ) + m (top figure) and Δα/α = B r cos(Θ) (bottom figure), 
where r = ct and t is the lookback time to the absorber. The pole and antipole of the 
CMB dipole are marked as P_CMB and A_CMB respectively. In the model which includes 
a monopole (top figure), the Keck confidence region is large due to a relative degeneracy 
with the monopole; the region is much smaller in the bottom figure on account of no 
monopole term being included. 

4-8 Summary 

In this chapter we have presented 154 new many-multiplet constraints on Δα/α derived 
from spectra obtained using VLT/UVES. A simple weighted mean analysis shows that 
these values of Δα/α appear inconsistent with the Keck results of Murphy et al. (2004). 
However, if we consider that angular (and therefore spatial) variations in α are possible, 
then the two data sets are rendered consistent with each other. The combination of the 
two data sets yields statistically significant evidence for angular variations in α at the 
4.1σ level, with the best-fitting dipole having an angular amplitude of 0.97 (+0.22, −0.20) × 10⁻⁵, 
and pointing in the direction RA = (17.3 ± 1.0) hr, dec = (−61 ± 10)°. If we consider a 
simple model for distance using the lookback-time distance, we find that the statistical 
significance of the dipole increases to 4.2σ. In this case the dipole has an amplitude of 
(1.1 ± 0.2) × 10⁻⁶ GLyr⁻¹, and points in a similar direction. 

The data display a remarkable consistency. Dipole fits to low (z < 1.6) and high (z > 1.6) 
cuts of the sample point in a similar direction, as do dipole fits to the Keck and VLT 
data separately. Similarly, the significance of the dipole is robust under removal of data 
at random. If we take a less conservative approach to treating the data, the significance 
of the dipole approaches 7σ. 

A weighted mean of the VLT and Keck Δα/α values (a whole-sample monopole) yields 
(Δα/α)_w = (−0.216 ± 0.086) × 10⁻⁵. However, this value should be interpreted with 
caution, given the fact that there appears to be significant angular dependence for α and 
the fact that the monopole takes on significantly different values at low (z < 1.6) and high 
(z > 1.6) redshift. 

The cause of the difference between the monopole at low and high redshifts is unknown, 
and is a weakness of the results presented here. We argued that the most likely explanation 
is evolution in the abundance of magnesium isotopes, and discussed other 
possible explanations. Because most of the significance for the dipole originates 
at high redshifts, where the monopole is not present, and because of the consistency 
between low- and high-redshift samples, and between the Keck and VLT results, we do 
not think that this significantly affects the evidence for spatial variation of α. 

The results of this chapter therefore yield significant statistical evidence for spatial varia- 
tion in the fine-structure constant. 

It is possible that the results presented here are the result of some unknown systematic 
effect, or combination of systematic effects. We discuss potential systematic effects in the 
next chapter. 

Systematic errors for Δα/α 



5-1 Introduction 

It is easy to conceive of a large number of possible systematic effects which could, if 
present, spuriously generate a specific form for a non-zero Δα/α. In particular, if one 
assumes that Δα/α is well described by a weighted mean, or one considers the monopole 
term of our dipole fit, there are a large number of effects which could push either of these 
values away from zero. 

To generate angular variation in α from a systematic effect is, however, rather harder 
than producing an offset from Δα/α = 0. Any such effect, if it exists, must be well 
correlated with sky position or must be a combination of systematics that by coincidence 
mimics angular variation in α. On the whole, we argue that a detection of angular 
variation in α is relatively robust to potential systematic effects. Nevertheless, in this 
chapter we explore the potential impact of a number of systematic effects. 

Murphy et al. (2001a, 2003a) considered a wide range of potential systematic effects in 
relation to the Keck results, including: "potential kinematic effects, line blending, wave- 
length miscalibration, spectrograph temperature variations, atmospheric dispersion and 
isotopic/hyperfine-structure effects". They concluded that only the latter two effects are 
potentially large enough to be of significance, and that neither of these can explain the 
Keck results. 

Some of these potential systematic errors are common to the VLT sample because we 
observe the same types of absorbers as are in the Keck sample, and the impact of many 
of them in the VLT should be similar to the Keck sample because the statistical 
constraints on Δα/α from individual absorbers in the VLT sample are of the same order of 
magnitude as those from absorbers in the Keck sample. Certainly, the same considerations 
regarding potential kinematic effects and line blending apply, and so these effects should 
also not significantly affect our results. UVES has operated with an image rotator since 
observations commenced, and so the concern about atmospheric dispersion that is present 
for some of the Keck sample does not affect the VLT sample. Spectrograph temperature 
variations should also be small, as the UVES enclosure is thermally isolated, and the VLT 
enclosure is air-conditioned to minimise thermal variation (D'Odorico et al., 2000). Given 
that spectrograph temperature variations are unable to explain the Keck result, the design 
of UVES in this respect should ensure that such effects are negligible in our sample. 

It is conceivable that telescope flexure could induce some systematic effect into the Δα/α 
results. In the most obvious case this would make Δα/α correlated with the zenith angle 
of the observations. Murphy et al. (2003a) explicitly considered the possibility that Δα/α 
could be correlated with zenith angle, and found no evidence for significant correlation, 
which seems to rule out this problem in the Keck sample. We have not explicitly addressed 
this concern here given the findings of Murphy et al., and note that any systematic which 
mimics angular variation in α must not only be well correlated with sky position, but 
must do so in a way which is consistent between the two telescopes. A systematic which is 
correlated with zenith angle is not sufficient to produce the observed dipole effect; such an 
effect should produce a variation in α that is approximately symmetric about the latitudes 
of the telescopes projected onto the sky (i.e. dec. ~ 20° for Keck and dec. ~ −25° for VLT), 
which is not what is seen. Importantly, such an effect is unable to produce the consistency 
observed between the dipole locations. 

Now that we are utilising data from two telescopes, the obvious question arises as to 
whether some difference between the telescopes could manufacture or alter a dipole signal. 
The fact that a 2.2σ dipole is seen in the VLT data alone (section 4-5.2) and that there 
is good alignment between dipoles fitted to the Keck and VLT samples (section 4-6.5) 
suggests that inter-telescope differences are not responsible for the observed effect. One 
way of trying to determine the impact of differences between the telescopes would be to 
attempt to calculate any such differences from first principles. However, any potential 
systematics are likely to be extremely subtle, and depend on a variety of factors relating 
to the telescopes and instruments. A more direct approach is to compare spectra of the 
same objects taken by both telescopes. Absorption features in these spectra should appear 
at the same wavelengths in spectra from both telescopes^. Any difference constitutes a 
relative distortion of the wavelength scale between the two telescopes. This technique is 
powerful, and does not require a priori knowledge of how the wavelength scale distortions 
are generated. Because of the importance of this technique, we present it first as the Δv 
test in section 5-2. 

^This is not strictly true: if the dynamical timescale of the absorption process is comparable to the time 
difference between exposures, then evolution in the absorption features is possible. We include changes in 
the position of the gas clouds in the definition of the dynamical timescale, as proper motion of the clouds 
could produce changes in the observed column density. For transitions with multiple velocity components, 
this will produce apparent shifts in line centroids. 

The issue of wavelength calibration is potentially tricky. Since the work of Murphy et al. 
(2003a), wavelength scale distortions have been identified within echelle orders in both 
Keck/HIRES (Griest et al., 2010) and VLT/UVES (Whitmore et al., 2010). We discussed 
the potential origin of these in section 3-6.4.2 in the context of Δμ/μ. As in that section, 
the fact that we use transitions across the whole optical range combined with the 
non-monotonic nature of the distortions means that any bias introduced into Δα/α by these 
distortions should average out over a large enough sample of absorbers. Murphy et al. 
(2009) explored the impact of distortions of this type on the Keck results and found that 
the impact on the weighted mean was effectively negligible. Nevertheless, it is worth 
exploring this effect further, and we do this in section 5-4. 

We explore the potential impact of the fact that UVES is a dual-armed spectrograph in 
section 5-5. 

As was done by Murphy et al. (2003a), we explore the effect of a different heavy Mg isotope 
fraction in the quasar absorbers relative to terrestrial values in section 5-6. 

5-2 Inter-telescope systematics and the Δv test 

Suppose that some systematic effect intrinsic to the telescope created a distortion of the 
wavelength scale. Two possible types of wavelength distortions 
exist: stationary and non-stationary. Stationary (i.e. time-invariant) distortions could be 
produced due to some intrinsic aspect of the telescope or instrument. Non-stationary 
distortions could be produced by a wide range of phenomena, including atmospheric 
effects and the method through which the telescope tracks the quasar source (i.e. the 
accuracy of slit centering). All of the Keck spectra used in the analysis in this paper were 
acquired whilst HIRES had only one CCD chip. In this configuration, multiple exposures 
are needed to yield full wavelength coverage. If the quasar image is not precisely centred 
in the spectrograph slit for every exposure, velocity offsets between spectral segments 
obtained at different times are possible. This issue should be substantially mitigated at VLT, 
as UVES can acquire almost the entire spectral range in a single observation. The effect 
could be exacerbated in conditions of good seeing and could include an additional small 
effect due to the seeing profile decreasing slightly towards the red end of the spectrum. 

It so happens that the VLT and Keck samples have 7 quasars in common. We give a list
of the quasars common to the VLT and Keck samples in table 5.1. The use of common
sources allows one to search for problems with wavelength calibration; absorption features
should be found at the same barycentric vacuum wavelength between different exposures.
This inspires a method of searching for distortions of the wavelength scale in both the
Keck and VLT spectra. In the simplest sense, one aims to cross-correlate particular
patches of spectra and try to verify whether absorption features really do occur at the
same wavelength, or whether some correction is required to achieve a good match. Note
that the number of absorption lines which can be used for this purpose is much larger than
is used for analysing Δα/α. Whilst for Δα/α many absorption lines are needed to yield a
single measurement of Δα/α, in principle each absorption line in the spectrum yields one
constraint on potential wavelength distortion.

Table 5.1: List of quasars common to the Keck and VLT samples. Keck names are given
in B1950 format, whilst VLT names are given in J2000 format.

Keck sample name    VLT sample name
0216+0803           J021857+081727
0237-233            J024008-230915
0940-1050           J094253-110426
1202-0725           J120523-074232
0528-250            J053007-250329
1337+1121           J134002+110630
2206-1958           J220852-194359

One possibility is to use direct cross-correlation methods; however, this suffers from the
fact that the spectral resolutions of VLT and Keck spectra are different, and so direct
cross-correlation requires rebinning of the spectra onto a common wavelength scale. A
more inspired approach is to actually model the quasar absorbers directly. By imposing
an assumption about the nature of the observed profiles (namely that they are Voigt
profiles), one can obtain substantially tighter constraints on any wavelength distortion.

To explore potential wavelength scale distortions, we use a method which we refer to
as the Δv test. The method proceeds as follows: i) for each common quasar, visually
identify regions of non-terrestrial absorption, typically having a width of a few Å; ii) for
each of these regions, perform a Voigt profile fit to the VLT spectral data (identification
of the transition responsible is unimportant); iii) fit corresponding spectral regions of the
Keck and VLT spectra simultaneously, but with an extra free parameter, Δv, which allows for
a velocity shift between the two spectral regions. R. F. Carswell has kindly modified
VPFIT to be able to estimate Δv. The VLT spectral data for these regions were kindly
fitted by M. Bainbridge using an automated Voigt profile fitting routine designed to fit
regions of the forest automatically, and he has provided us with Δv values derived from
the joint fits to the Keck and VLT data. Δv is defined hereafter as the velocity difference
Δv = v(VLT) − v(Keck) which must be applied to minimise χ² between two comparable
spectral regions. In particular, this means for a particular transition that

(5.1)

Each value of Δv provides an estimate of the velocity offset between the two telescopes at
that observed wavelength, giving Δv(λ). One can therefore examine the functional form
of Δv(λ)_i, where i refers to the ith quasar pair under consideration. For each set of Δv
values from a spectral pair, we use the LTS method to calculate the weighted mean of that
set of Δv values, which we then subtract from the Δv values for that spectral pair. This
is to remove any constant offset resulting from mis-centering of the quasar within the slit.
We use k = 0.95n for the LTS fit (see section 4-4.8.4).

Any relative wavelength scale distortion can in principle be removed by applying an inverse
function based on the observed Δv data. To see this, consider the form of the distortion.
For an absorption line with rest wavelength λ_0, observed wavelength λ_i, and velocity
distortion Δv,

$$\lambda_i = \lambda_0 (1 + z)\left(1 + \frac{\Delta v}{c}\right) \qquad (5.2)$$

where we have assumed that Δv is constant over the absorption profile under consideration.
The effect of Δα/α can be ignored: whatever transition is being examined is the same
in both spectra, and so any effect due to a change in constants will be absorbed into the
determination of z. There are two options to attempt to remove the wavelength scale
distortion given some function Δv(λ_obs). One could modify the spectral data, changing
the observed wavelengths as

$$\lambda_{\rm obs} \rightarrow \frac{\lambda_{\rm obs}}{1 + \Delta v(\lambda_{\rm obs})/c} \qquad (5.3)$$

When one fits a particular transition, the other possibility is to perturb the rest wavelength
of the transition fitted, as

$$\lambda_0 \rightarrow \lambda_0\left(1 + \frac{\Delta v}{c}\right) \qquad (5.4)$$

We use the second option for ease of implementation within VPFIT. Doing this means that
the value of Δα/α derived from the fit will be the same as if the wavelength scale from the
other telescope in the spectral pair had been used, thereby removing any inter-telescope
differences (provided that Δv is correctly specified).
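The rest-wavelength perturbation of equation 5.4 can be illustrated with a short sketch. This is not the VPFIT implementation; the function names are hypothetical, the linear form of Δv(λ) anticipates equation 5.5, and the sign convention is simply that of equation 5.4 as written.

```python
C_KMS = 299792.458  # speed of light in km/s


def delta_v_linear(wavelength_obs, a, b):
    """Linear distortion model: delta-v in km/s at an observed wavelength in Angstroms."""
    return a * wavelength_obs + b


def perturbed_rest_wavelength(rest_wavelength, z, a, b):
    """Equation 5.4: lambda_0 -> lambda_0 * (1 + delta_v / c).

    delta_v is evaluated at the observed wavelength lambda_0 * (1 + z) and is
    assumed constant across the absorption profile, as in the text.
    """
    wavelength_obs = rest_wavelength * (1.0 + z)
    dv = delta_v_linear(wavelength_obs, a, b)
    return rest_wavelength * (1.0 + dv / C_KMS)


if __name__ == "__main__":
    # Illustrative values only: a transition near 2796 Angstroms at z = 1.
    print(perturbed_rest_wavelength(2796.3543, 1.0, a=-7e-5, b=0.38))
```

Because the perturbation is attached to a rest wavelength rather than to the data, each fitted transition can carry only one value of Δv per fit.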

In all our analysis in this section we have removed those absorbers which were previously 
flagged as outliers from consideration in the statistical analysis. 



5-2.1 The Δv data

We show the Δv data for 6 of the quasar spectral pairs ("core pairs"), which appear similar
to each other, in figure 5.1. We analyse the Δv data from these quasars in the following
section. We noticed a problem with the 7th pair, 2206-1958/J220852-194359, which
displays variations of Δv with wavelength which are grossly different from the other six
pairs. A systematic trend in Δv is seen, with a maximum difference in Δv of ∼2.5 km s⁻¹
over the range 4000 ≲ λ ≲ 6000 Å. In section 5-2.3, we apply an inverse function derived
from the Δv data seen in this spectral pair to all the VLT spectra and show that a
distortion of this type cannot affect all the data. We consider the joint impact of the Δv
functions from the 6 core quasars and from 2206-1958/J220852-194359 in section 5-2.4.

Figure 5.1: Binned plot of Δv values for the six core quasar pairs with 5 points per
bin. Δv values are calculated as the weighted mean of Δv values contributing to the bin.
Each set of Δv values is normalised to ⟨Δv⟩ = 0 as described in section 5-2.2. Note that
the different quasar pairs sample different regions of the wavelength space, and that some
pairs provide substantially more points than others.

We note the presence of significant outliers within the Δv data from the six core pairs.
Therefore we rely wholly on robust statistical methods to estimate parameters for
phenomenological models of Δv(λ).

5-2.2 Core pairs 

In figure 5.1 we show binned values of Δv(λ) for the six core quasar pairs. The trend in
each spectral pair is different, but no common trend is seen. For instance, it appears (by
eye) from 0216/J021857 that Δv increases with increasing wavelength. It is difficult to
conclude what the functional form of Δv(λ) is from 1337/J134002 and 0237/J024008 due
to a paucity of data, although 0237/J024008 suggests no significant trend. 0528/J053007
seems to suggest that Δv decreases markedly with increasing wavelength. The conclusion
from 1202/J120523 is unclear, and the interpretation from 0940/J094253 is complicated
by non-linear behaviour. Importantly, the functional form of Δv appears to differ in both
magnitude and sign between quasars. This suggests that any relative wavelength distortion
is likely to average out over a large number of absorbers. Additionally, the wavelength
coverage of the Δv data for most spectra is significantly smaller than the wavelength
range within which MM absorbers are fitted. This means that from each spectral pair it
is impossible to tell what the wavelength distortion might be over large amounts of the
spectral range.

5-2.2.1 Linear fit 

Because the Δv values from each spectral pair do not densely span the whole
spectroscopic wavelength range, we combine the Δv values from each of the six core
pairs together in order to estimate a common function which spans the full wavelength
range. The functional form of this is unknown; however, a high-order polynomial cannot
be statistically supported. We use a linear function as a first approximation. We fit the
linear function with the LTS method, using k = 0.95n. We show this linear fit in figure
5.2. For the form

$$\Delta v = a\lambda + b, \qquad (5.5)$$

a = (−7 ± 14) × 10⁻⁵ km s⁻¹ Å⁻¹ and b = 0.38 ± 0.71 km s⁻¹. Note firstly that a is
statistically consistent with zero. Therefore, it is difficult to conclude that a common
linear systematic exists in the Δv data. Nevertheless, in section 5-2.2.3 we apply an
inverse function of this form to the VLT spectral data to determine the effect that a
wavelength distortion of this type and magnitude would have on the dipole.

Figure 5.2: Binned plot of Δv values for the six core quasar pairs with a LTS linear
fit. Only points which contribute to the fit are shown (that is, the worst 5 percent of
the data has been excluded). Most transitions used in the MM analysis fall in the range
4000 ≲ λ ≲ 7000 Å; from this graph there is no evidence for a significant
wavelength distortion in this region.
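The LTS linear fit used above can be sketched as follows. The thesis defines its LTS method in section 4-4.8.4, which is not reproduced here, so this is only a generic least-trimmed-squares illustration under stated assumptions: residuals are scaled by the statistical errors, the kept subset has size k = floor(0.95n), and the search uses random two-point starts followed by concentration steps. All function names are hypothetical.

```python
import math
import random


def weighted_line_fit(x, y, sigma):
    """Weighted least-squares fit of y = a*x + b. Returns (a, b)."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    return (sw * sxy - sx * sy) / det, (sxx * sy - sx * sxy) / det


def lts_line_fit(x, y, sigma, keep_fraction=0.95, n_starts=200, n_csteps=50, seed=1):
    """Least trimmed squares line fit keeping the best k = floor(keep_fraction * n) points.

    Returns (a, b, kept_indices). This is a heuristic search, not an exact solver.
    """
    rng = random.Random(seed)
    n = len(x)
    k = max(2, math.floor(keep_fraction * n + 1e-9))

    def fit(indices):
        return weighted_line_fit([x[i] for i in indices],
                                 [y[i] for i in indices],
                                 [sigma[i] for i in indices])

    def scaled_sq_residuals(a, b):
        return [((y[i] - a * x[i] - b) / sigma[i]) ** 2 for i in range(n)]

    best = None
    for _ in range(n_starts):
        subset = rng.sample(range(n), 2)       # random two-point starting line
        if x[subset[0]] == x[subset[1]]:
            continue                           # vertical line: skip this start
        for _ in range(n_csteps):              # concentration steps
            a, b = fit(subset)
            r2 = scaled_sq_residuals(a, b)
            new_subset = sorted(range(n), key=r2.__getitem__)[:k]
            if set(new_subset) == set(subset):
                break
            subset = new_subset
        a, b = fit(subset)
        r2 = scaled_sq_residuals(a, b)
        cost = sum(r2[i] for i in subset)      # trimmed sum of squared residuals
        if best is None or cost < best[0]:
            best = (cost, a, b, sorted(subset))
    return best[1], best[2], best[3]
```

Varying `keep_fraction` in such a sketch corresponds to the trimming-fraction test of section 5-2.2.2.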

5-2.2.2 Reasonableness of k = 0.95n

A legitimate question to ask is whether the choice of k = 0.95n is reasonable. We show
the effect of different choices of k in figure 5.3. Our estimate of the slope is not overly
sensitive to the choice of k. With the exception of a small region around k = 0.7n, the
general trend is for the magnitude of the slope to decrease with decreasing k. The fact
that the slope decreases with increased trimming implies that the underlying trend may be
less than what we have estimated.

5-2.2.3 Application to the VLT sample 

To investigate the effect of the potential wavelength distortions from the 6 core pairs, we
apply an inverse Δv function (equation 5.5) to all the VLT absorbers by perturbing the
rest wavelengths of the transitions fitted in each absorber, as described in section 5-2. We
apply the same linear function in every VLT spectrum fitted. This therefore puts the VLT
and Keck spectral data on a common wavelength scale. Any observed angular variation in
α which survives the inverse function cannot be due to stable inter-telescope wavelength
calibration differences.

Figure 5.3: Effect of trimming fraction (percent of data trimmed) on the slope of a LTS
linear fit of Δv vs λ for the six core quasar pairs. The solid, red line corresponds to a
weighted LTS fit, whilst the dashed, blue line corresponds to an unweighted fit. We have
modelled the distortion with a slope of −7 × 10⁻⁵ km s⁻¹ Å⁻¹, which is as large as or larger
than the weighted fit over most of the range considered. As a general trend, as the trimming
fraction increases, the amplitude decreases, which therefore suggests that our chosen
magnitude is a reasonable estimate of the maximum distortion allowed by the data.




Because we apply the inverse function by perturbing the rest wavelengths of transitions
in absorbers fitted, we can only do this where each fitted transition occurs in only one
spectral region in a particular fit. There are two pairs of absorbers where we have fitted
both absorbers in the pair simultaneously, because a transition from one absorber in the
pair overlaps with a transition from the other absorber (at a different redshift) in the pair.
In this case, a particular transition can be fitted twice (in two absorbers, in two widely
separated spectral regions). However, we cannot apply two different perturbations to a
single rest wavelength. Therefore, we remove these two pairs of absorbers to form a "VLT
reference set". Thus, we compare the VLT set of Δα/α values where the Δv inverse
function has been applied with the Δα/α values from the VLT reference set.
The two pairs of absorbers which are removed are the z ∼ 2.253 and z ∼ 2.380 absorbers
associated with J214225-442018 and the z ∼ 1.154 and z ∼ 0.987 absorbers in the same
spectrum (i.e. 4 absorbers are removed to form the VLT reference set). This means that
the VLT reference set contains 149 absorbers.

In table 5.2, we give the results of applying the Δv inverse function above to those absorbers
in the VLT set. The effect is generally to push Δα/α to more negative values. We
show an updated plot of the confidence regions of the Keck, VLT and combined dipole
locations in figure 5.4. Although the statistical significance of the dipole decreases from
3.9σ (reference set) to 3.1σ, the position of the VLT (and therefore combined) dipole
is effectively unchanged. This accords well with our earlier argument that because the
detection of a dipole is a differential effect, it is difficult to emulate through any simple
systematic. The Keck and VLT dipoles in this case are separated by 25°, which has
a chance probability of 7 percent (see section 4-6.5). Also note that introducing this
modification to the wavelength scale of the VLT spectra does not significantly change the
good alignment between the z < 1.6 and z > 1.6 samples. The dipole directions in this
case are separated by 13°, which has a chance probability of 2 percent.




Figure 5.4: 1σ confidence regions for the Keck (green), VLT (blue) and combined (red)
dipoles, where the linear inverse function derived from the Δv data from the 6 core spectral
pairs has been applied to the VLT set, under the model Δα/α = A cos Θ + m. Although
the statistical significance of the dipole decreases from 3.9σ to 3.1σ, the positions of the
VLT and combined dipole are effectively unchanged.

We would expect that any good model for a wavelength-dependent systematic should
quantitatively improve the fit of the dipole model to the Δα/α values. To see if the Δv model
significantly improves the fit, we compare the AICC of the model Δα/α = A cos Θ + m
fitted to different sets of Δα/α values: i) AICC_VLT,Δv, the AICC of the angular dipole
model fitted to the VLT absorbers in the VLT reference set with the linear Δv function
described above applied; ii) AICC_VLT,ref, the AICC of the angular dipole model fitted
to the VLT absorbers in the VLT reference set; iii) AICC_VLT,Δv+Keck, the AICC for the
angular dipole model fitted to the absorbers in set (i) combined with the Keck absorbers in
the Keck04-dipole set, and; iv) AICC_VLT,ref+Keck, the AICC for the angular dipole model
fitted to the absorbers in set (ii) combined with the Keck absorbers in the Keck04-dipole
set. In all cases we apply the same values of σ_rand to the VLT Δα/α values in order to
compare points on a like-with-like basis. We find that AICC_VLT,Δv − AICC_VLT,ref ≈ −0.8,
indicating that the set of VLT absorbers with the Δv inverse function applied is preferred,
but not significantly. Comparing the VLT+Keck set to the equivalent reference set, we
find that AICC_VLT,Δv+Keck − AICC_VLT,ref+Keck ≈ −3.2, indicating that the VLT+Keck set
where the Δv inverse function has been applied to the VLT absorbers is weakly preferred.
However, when comparing the reference sets and the Δv sets, the AICC does not account
for the extra two parameters for the linear model of Δv vs λ. Thus, with a two-parameter
model for the Δv function there is no significant preference for the Δv results, and thus
there is no strong evidence in the Δα/α values themselves for a wavelength distortion of
this type.

In deriving the results above, we have assumed that Δv values from different spectral
pairs may be legitimately combined in order to estimate a common systematic. This may
not be a good assumption, given the differences in the signal-to-noise of the spectral data,
the spectral range which the Δv values cover, the potential functional form of Δv(λ)_i and
the number of exposures. As an alternative, we proceed as follows: i) fit a linear model to
the Δv values from each spectral pair using the LTS method; ii) from each model, estimate
Δv(λ) along with an uncertainty on the estimate; iii) for the six estimates of Δv at each λ,
form a weighted mean of the estimates, Δv_w(λ), and calculate the associated uncertainty,
and; iv) plot Δv_w(λ) as a function of wavelength. We show the result of this in figure
5.5. Importantly, under this model we can find no wavelength where Δv is statistically
different from zero.
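Steps (ii) and (iii) above amount to propagating each pair's fit uncertainty to a given wavelength and then forming an inverse-variance weighted mean. A minimal sketch follows, assuming each per-pair fit supplies a slope, an intercept and their covariance terms. The tuple layout and function names are hypothetical and are not taken from the thesis.

```python
import math


def predict_with_error(lam, a, b, var_a, var_b, cov_ab):
    """Value and 1-sigma error of the linear model a*lam + b at wavelength lam."""
    value = a * lam + b
    variance = lam ** 2 * var_a + var_b + 2.0 * lam * cov_ab
    return value, math.sqrt(variance)


def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its formal 1-sigma uncertainty."""
    weights = [1.0 / e ** 2 for e in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


def joint_estimate(lam, fits):
    """Combine per-pair linear fits at one wavelength.

    `fits` is a list of (a, b, var_a, var_b, cov_ab) tuples, one per quasar pair.
    """
    predictions = [predict_with_error(lam, *fit) for fit in fits]
    values = [p[0] for p in predictions]
    errors = [p[1] for p in predictions]
    return weighted_mean(values, errors)
```

Evaluating `joint_estimate` on a grid of wavelengths would give a curve and confidence band of the kind shown in figure 5.5. The formal uncertainty excludes any error from mis-specification of the linear model.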

5-2.2.4 Skeptical Bayesian Linear Regression 

Given the large range in the statistical error bars on the Δv data, it is possible that by
discarding even a few percent of the data we are also discarding the data with the highest
statistical precision. Clearly, if these data are strongly inconsistent with the general trend
given by the majority of the data then their relative influence should be downweighted.
To investigate how all the data might be used without needing to decide what fraction of
the data should be trimmed, we apply the SBLR method of section 4-4.8.3. That is, we
use a Bayesian method where we regard the statistical errors as lower bounds on the true
error.

To maximise the likelihood, L, in equation 4.33 we use a simplex algorithm from Press
et al. (1992), which does not require knowledge of the derivatives of L with respect to the
parameters. Because of the functional form of L, the possibility of multiple likelihood
maxima arises (Sivia & Skilling, 2006). For parameter estimates, one is interested in
the global likelihood maximum. To avoid this potential trap, we choose a wide range
of plausible starting values for the slope and intercept of the linear function, and run
the simplex algorithm 10,000 times. We keep the parameters from whichever of those
iterations produces the maximal L. Application of SBLR to the data yields a slope of
a = −7.2 × 10⁻⁵ km s⁻¹ Å⁻¹, which accords well with that found through the LTS method
above. We show the result of this fit in figure 5.6.

Figure 5.5: Δv_w(λ), a joint estimate for Δv(λ) from the 6 core quasar pairs made without
combining the data into a single linear fit. The red (solid) line shows the estimate for Δv,
and the dashed (blue) lines show the 95 percent confidence interval on the estimate. One
can see that over the range where most of our Δv data are obtained (4000 Å < λ < 7000 Å)
Δv is relatively flat. Δv diverges from zero for λ < 3500 Å (not shown) and for
λ > 7000 Å; however, there are few Δv measurements to constrain Δv in these regions. At
no wavelength is Δv significantly different from zero. Because the linear model used for
each quasar fit is unlikely to be a true description of any underlying wavelength distortion,
there is also uncertainty due to model specification, which is naturally not included in the
confidence region shown above. As such, the confidence region shown is under-estimated.
These considerations show that there is no statistically significant evidence for a common
systematic from consideration of the 6 core quasar pairs.
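The multi-start strategy can be sketched in a few lines. Two assumptions are made loudly here. First, equation 4.33 is not reproduced in this section, so the likelihood below is the standard "error bars as lower bounds" form from Sivia & Skilling, which may differ in detail from the thesis's expression. Second, a simple compass (pattern) search stands in for the Press et al. simplex routine. All names are hypothetical.

```python
import math
import random


def log_likelihood(params, x, y, sigma):
    """Conservative log-likelihood for y = a*x + b (assumed form, see lead-in).

    Per point: (1 - exp(-R^2/2)) / (sigma * sqrt(2*pi) * R^2), with R the
    residual in units of the quoted error; this tends to 1/2 as R -> 0.
    """
    a, b = params
    total = 0.0
    for xi, yi, si in zip(x, y, sigma):
        r2 = ((yi - (a * xi + b)) / si) ** 2
        core = 0.5 if r2 < 1e-12 else -math.expm1(-0.5 * r2) / r2
        total += math.log(core) - math.log(si * math.sqrt(2.0 * math.pi))
    return total


def compass_search(f, start, steps, n_halvings=40):
    """Derivative-free local maximiser (a stand-in for a simplex algorithm)."""
    point, step = list(start), list(steps)
    best = f(point)
    for _ in range(n_halvings):
        improved = True
        while improved:
            improved = False
            for i in range(len(point)):
                for sign in (1.0, -1.0):
                    trial = list(point)
                    trial[i] += sign * step[i]
                    value = f(trial)
                    if value > best:
                        point, best, improved = trial, value, True
        step = [0.5 * s for s in step]
    return point, best


def multi_start_fit(x, y, sigma, slope_range, intercept_range, n_starts=100, seed=0):
    """Run the local search from many random starts and keep the best maximum."""
    rng = random.Random(seed)
    steps = [0.25 * (slope_range[1] - slope_range[0]),
             0.25 * (intercept_range[1] - intercept_range[0])]
    best_point, best_value = None, -math.inf
    for _ in range(n_starts):
        start = [rng.uniform(*slope_range), rng.uniform(*intercept_range)]
        point, value = compass_search(lambda p: log_likelihood(p, x, y, sigma), start, steps)
        if value > best_value:
            best_point, best_value = point, value
    return tuple(best_point), best_value
```

Centring the wavelengths on their mean before fitting reduces the correlation between slope and intercept, which helps an axis-aligned search of this kind. A simplex method is less sensitive to this.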

Estimation of the uncertainty on the slope must be done carefully. A simple approach
is to estimate the Hessian matrix (the matrix of second order partial derivatives) at the
purported solution using finite difference derivatives, take the inverse to obtain the
covariance matrix, and read off the square roots of the appropriate diagonal entries (Sivia &
Skilling, 2006). In the large-data limit this approach will be valid on account of the central
limit theorem. However, because of the functional form of equation 4.33: i) there can be
multiple likelihood maxima for small sample sizes, and ii) the likelihood function will have
fatter tails than a Gaussian. Therefore the formal covariance matrix at the best-fitting
solution is likely to under-estimate the true uncertainty. The formal covariance matrix of
the fit for figure 5.6 gives the error as 4.0 × 10⁻⁵ km s⁻¹ Å⁻¹, and so the slope differs from
zero at the 1.8σ level (bearing in mind that this must not be converted to a probability
value using Gaussian statistics unless one believes that the error is Gaussian).

Figure 5.6: Skeptical Bayesian linear regression (SBLR) applied to the Δv results for the
six core quasar pairs. Data here are shown unbinned, so the full effect of the outliers can
be seen. Note that although outliers are still present in the fit, they do not appear to bias
the result. In particular, the point at approximately 3500 Å with Δv ∼ −3.7 km s⁻¹ would
otherwise be highly damaging to the fit. The slope of this fit is −7.2 × 10⁻⁵ km s⁻¹ Å⁻¹.
Analytic 1σ confidence limits on the fit are shown as blue, dashed lines, but for reasons
given in the text these are inappropriate.

An alternative method is to explore the likelihood function directly to obtain confidence
limits on the slope which are not subject to the Gaussian approximation. To do this, we
utilise the Markov Chain Monte Carlo (MCMC) machinery of chapter 7. We defer a full
explanation of the mechanics of this method until that chapter. With 250,000 samples
of the likelihood function, we find that a = (−7.1 ± 5.9) × 10⁻⁵ km s⁻¹ Å⁻¹ at the 68.3
percent confidence level. That is, the 68.3 percent confidence interval is some ∼50 percent
larger than implied by the formal covariance matrix. At the 95 percent confidence level,
the error is 11.5 × 10⁻⁵ km s⁻¹ Å⁻¹, and so we can conclude that the slope is statistically
consistent with zero. We show the probability distribution of a in figure 5.7.

It is additionally worth exploring what the value of the slope is in individual systems, i.e.
where we do not combine the data from different quasar pairs. We give the results of this
in table 5.3. Both the magnitudes and signs of the slope vary between the different quasar
pairs. Importantly, all of the slopes are reasonably consistent with zero with the exception
of 0528/J053007, which nevertheless does not deviate strongly from zero.

Table 5.3: Skeptical Bayesian Linear Regression (SBLR) applied to the 6 core quasar
pairs. Uncertainties are derived from Markov Chain Monte Carlo methods, and are given
at the 68.3 percent level (1σ-equivalent). The central estimate is taken here as the mean of
the upper and lower bounds of the confidence region. Although this does not necessarily
coincide with the probability mode, the difference in all cases is not significant. The only
system for which the statistical significance of the slope of Δv vs wavelength is significantly
larger than "1σ" is 0528-250/J053007-250329.

Quasar pair                   Slope of Δv vs wavelength (10⁻⁵ km s⁻¹ Å⁻¹)
0216+0803/J021857+081727        7.1 ± 33.7
0237-233/J024008-230915         0.3 ± 22.6
0940-1050/J094253-110426      −26.5 ± 21.3
1202-0725/J120523-074232       −4.2 ± 18.7
0528-250/J053007-250329       −23.8 ± 13.4

Figure 5.7: Probability distribution for the slope of Δv vs wavelength under SBLR for
the 6 core quasar pairs. The slope is statistically consistent with zero, and therefore there
is no evidence for a common linear wavelength miscalibration in the 6 core quasar pairs.



In summary: we are unable to detect a statistically significant linear wavelength distortion
common to the 6 core spectral pairs. Applying to the entire VLT spectral sample a simple
linear model for Δv(λ) from the six core pairs reduces the statistical significance of the
dipole, but the statistical significance still remains high enough to be of interest. The
systematic applied here does not destroy the good alignment between the fitted Keck and
VLT dipole vectors. We are therefore unable to remove the dipole effect from the combined
Keck and VLT sample.



5-2.3 2206-1958/J220852-194359 



In figure 5.8 we show the Δv data for the 2206-1958/J220852-194359 pair. Two things
are immediately obvious. Firstly, there is a clear anticorrelation of Δv with wavelength.
Secondly, the magnitude of the effect is extremely large, of the order of |δ(Δv)| ∼
2.5 km s⁻¹ over the range considered. A wavelength distortion of this magnitude will have
a substantial impact on determining Δα/α in any spectra affected by it.

Figure 5.8: Binned Δv results for the 2206-1958/J220852-194359 spectral pair, with 3
points per bin. Note the substantial slope over a relatively small wavelength range, with
a range in Δv of about 2 km s⁻¹.

The Δv test only examines calibration differences between Keck and VLT, and so we
cannot tell whether Keck or VLT is responsible for the significant trend in Δv for this
spectral pair.



5-2.3.1 Arctangent fit 

The limited spectral range (4000 < λ < 6000 Å) of the Δv data means that we simply
do not know what the functional form of Δv is for this spectral pair at λ > 6000 Å. In
order to estimate the potential impact of the distortion present on Δα/α values in the
whole sample, we need knowledge of Δv(λ) at all observed wavelengths. One possibility
is to assume that the relationship is linear, but then the extrapolation over the whole
spectral range results in a total change in Δv of ∼5 km s⁻¹, which is comparable to the
velocity width of the spectrograph slit; this seems too extreme. Additionally, the Δv
values in figure 5.8 do not seem to be linearly related with wavelength. We therefore try
a phenomenologically motivated arctangent model,

$$\Delta v = A \tan^{-1}\left[k(\lambda - \lambda_c)\right] + b. \qquad (5.6)$$



Applying the LTS method to this fit, with k = 0.95n, yields: A = (−0.98 ± 1.03) km s⁻¹,
k = (1.9 ± 2.6) × 10⁻³ Å⁻¹, λ_c = 4547 ± 526 Å and b = 0.48 ± 0.74 km s⁻¹, where errors are
derived from the diagonal terms of the covariance matrix at the best fit. Each Δv
uncertainty has been increased in quadrature with ∼0.26 km s⁻¹ to account for over-dispersion
about the LTS fit. We show the results of this fit, along with an extrapolation over the
useful wavelength range, in figure 5.9.

Figure 5.9: LTS arctangent fit applied to Δv results for the 2206-1958/J220852-194359
spectral pair. Results are shown binned, with 5 points per bin. 5 percent of the data is
ignored in the LTS fit, and only those points which contribute to the fit contribute to the
bins shown.

5-2.3.2 Application to the VLT sample

The impact of this wavelength perturbation on the values of Δα/α when this function is
applied to the VLT absorbers is severe. In particular, large numbers of points are scattered
away from Δα/α = 0, inducing a highly significant detection of Δα/α at z ∼ 0.8 and
z ∼ 1.5. For instance, a formal weighted mean of all the 1.3 < z < 1.8 points yields
Δα/α = (−3.96 ± 0.12) × 10⁻⁵, a 33σ "detection". If one multiplies the error by
√χ²_ν to account for χ²_ν = 14.7 about the weighted mean, then one obtains Δα/α =
(−3.96 ± 0.46) × 10⁻⁵, still an 8.5σ "detection". Such a signal is seen in neither the
Keck nor the VLT sample, which immediately implies that this particular relative wavelength
distortion cannot possibly apply to all of either the VLT or Keck spectra. That is, the
distortions seen in this particular Keck/VLT spectral pair appear not to be representative
of a significant fraction of the entire sample. However, the fact that we have identified this
distortion demonstrates the power and utility of the quasar pair analysis in identifying
systematic errors, even when their actual origin remains unknown.
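The error inflation quoted above is a one-line calculation: the formal error on the weighted mean is multiplied by the square root of the reduced χ² about that mean. The helper names below are illustrative only.

```python
import math


def reduced_chi2(values, errors, mean):
    """Reduced chi-squared of the values about a fitted mean (one fitted parameter)."""
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    return chi2 / (len(values) - 1)


def inflated_error(formal_error, chi2_nu):
    """Scale a formal error by sqrt(chi2_nu) to account for over-dispersion."""
    return formal_error * math.sqrt(chi2_nu)


if __name__ == "__main__":
    # Numbers from the text, in units of 1e-5: 0.12 -> ~0.46 for chi2_nu = 14.7.
    error = inflated_error(0.12, 14.7)
    print(round(error, 2), round(3.96 / error, 1))
```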



5-2.4 Overall effect of wavelength systematics using the Δv test

We now investigate whether a diluted form of the above effect (i.e. the effect from the
2206-1958/J220852-194359 pair) could exist in the spectral data in combination with
the (non-significant and much smaller) effect observed from the 6 core quasars. To do
this, we use a Monte Carlo approach where at each iteration we apply the inverse function
derived in section 5-2.2 (equation 5.5) to 6/7 of the quasar spectra selected at random
in the VLT sample, and the arctan function of equation 5.6 to the remaining 1/7 of the
spectra. We then apply the LTS method to estimate a new σ_rand. We then add the Keck
sample to this new VLT sample. At each iteration, we calculate the statistical significance
of the dipole using the bootstrap method. The mode of the distribution obtained is ∼2.2σ,
with quite substantial variation between iterations.
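The random assignment step of each Monte Carlo iteration can be sketched as below. Only the assignment is shown. Refitting the absorbers, re-estimating σ_rand and the bootstrap significance are outside the scope of this illustration. The default parameters are the best-fit values quoted for equations 5.5 and 5.6; the exponents of the slope a and of k are assumed here to be 10⁻⁵ and 10⁻³ respectively. Function names are hypothetical.

```python
import math
import random


def dv_linear(lam, a=-7e-5, b=0.38):
    """Equation 5.5 with the quoted LTS best-fit values (km/s, lam in Angstroms)."""
    return a * lam + b


def dv_arctan(lam, amp=-0.98, k=1.9e-3, lam_c=4547.0, b=0.48):
    """Equation 5.6 with the quoted best-fit values (exponent of k is an assumption)."""
    return amp * math.atan(k * (lam - lam_c)) + b


def assign_distortions(quasar_names, rng, arctan_fraction=1.0 / 7.0):
    """Randomly give the arctan distortion to ~1/7 of quasars, the linear one to the rest."""
    n_arctan = round(len(quasar_names) * arctan_fraction)
    chosen = set(rng.sample(list(quasar_names), n_arctan))
    return {name: (dv_arctan if name in chosen else dv_linear) for name in quasar_names}


if __name__ == "__main__":
    rng = random.Random(42)
    names = ["quasar_%02d" % i for i in range(14)]   # placeholder names
    assignment = assign_distortions(names, rng)
    print(sum(f is dv_arctan for f in assignment.values()), "quasars get the arctan model")
```

With these assumed parameters the arctangent model changes by about 2 km s⁻¹ between 4000 and 6000 Å, which matches the range seen in figure 5.8.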

To determine whether the Δv function significantly improves the goodness-of-fit in the
VLT sample, we compare the AICC at each iteration in the Monte Carlo simulation for
a dipole model (Δα/α = A cos Θ + m) fitted to the VLT Δα/α values in that iteration
with the AICC from a dipole model fitted to the Δα/α values in the VLT reference set,
where in each iteration we use σ_rand = 0.88 × 10⁻⁵ in order to compare Δα/α values on
a like-with-like basis. We show this distribution in figure 5.10. In only 3.5 percent of
iterations is the AICC lower than in the reference set. This implies that it is unlikely that
a wavelength distortion of this type is present in our data set. Indeed, in almost all of
the iterations the AICC is much larger than the AICC from the VLT reference set; the
median ΔAICC = 43.5. Importantly, in no case is ΔAICC < −10. Thus in no case
can we say that there is very strong evidence in favour of the model with the Δv function
applied. Additionally, the AICC does not account for the 6 parameters used in deriving
the Δv model, so we should expect a significant reduction in the AICC if the Δv function
is a good model. From this argument, we thus conclude that a wavelength distortion of this
type is unlikely to be present in the VLT spectral data.

Figure 5.10: AICC for 1000 iterations of a Monte Carlo simulation where the Δv function
from the 6 core pairs (equation 5.5) is applied to 6/7 of the VLT quasars at random, and
the Δv function from the 2206-1958/J220852-194359 pair (equation 5.6) is applied to
the remaining 1/7 of the quasars. The AICC here is calculated for the Δα/α data with
respect to the model Δα/α = A cos(Θ) + m for the VLT points only. The dashed red
line shows the actual AICC for the VLT reference set (see section 5-2.2.3). Only in 3.5
percent of the iterations is the AICC lower than for the reference set, which suggests that
it is unlikely that a distortion of this type is present in the VLT data. The median value
of the AICC here is 191.4, corresponding to a median ΔAICC = 43.5. For the median
case, the odds against a wavelength distortion of this type being present in the data are
≈ 3 × 10⁹ : 1.
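The odds quoted in the caption follow from the usual conversion of an AICC difference into relative model odds, exp(ΔAICC/2). The sketch below assumes the standard small-sample corrected form AICC = χ² + 2p + 2p(p+1)/(n − p − 1). The thesis's own definition appears in an earlier chapter and is not reproduced here, so treat the formula as an assumption.

```python
import math


def aicc(chi2, n_params, n_points):
    """Corrected Akaike information criterion for a chi-squared fit (assumed form)."""
    correction = 2.0 * n_params * (n_params + 1) / (n_points - n_params - 1)
    return chi2 + 2.0 * n_params + correction


def odds_against(delta_aicc):
    """Relative odds against the model whose AICC is higher by delta_aicc."""
    return math.exp(delta_aicc / 2.0)


if __name__ == "__main__":
    # Median case from figure 5.10: delta AICC = 43.5 gives odds of roughly 3e9 to 1.
    print("%.1e" % odds_against(43.5))
```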



5-3 Comment on Griest et al. (2010) I2 and ThAr measurements on Keck/HIRES

In section 3-6.4.2 we discussed how wavelength distortions may arise on account of the
different path that the quasar light may take through the spectrograph compared to the
thorium-argon (ThAr) light. These distortions may be long or short ranged. Griest et al.
(2010) detected long and short ranged calibration differences by comparing the calibration
using an I2 absorption cell and the standard ThAr calibration. In particular, they reported
drifts between the I2 and ThAr calibration scales of up to 2000 m s⁻¹ over several nights,
and claimed that "this level of systematic uncertainty may make it difficult to use
Keck HIRES data to constrain the change in the fine-structure constant".

The Δv test above explicitly includes the effect of any drifts in the wavelength calibration
both within a single night and between observation nights. From figure 5.2, the RMS
of the 66 binned points about the fit is ∼250 m s⁻¹. The mean wavelength separation
between these points is comparable to the echelle order width. This RMS can therefore be
compared directly to the spread in v_shift seen in figure 5 of Griest et al. (2010). In contrast
to their spread of ∼2000 m s⁻¹, we see typical wavelength distortions between VLT
and Keck which are some 8 times smaller. We have directly quantified the impact of
this on our measurements of Δα/α in section 5-2.2.3. Our results demonstrate that it is
possible to reliably use Keck/HIRES data to constrain the fine-structure constant from
quasar observations. We deal with the intra-order distortions in the next section.

5-4 Intra-order wavelength distortions 

In section 3-6.4.2 we noted the presence of intra-order wavelength distortions in both
Keck/HIRES spectra (Griest et al., 2010) and VLT/UVES spectra (Whitmore et al.,
2010). In this section we attempt to estimate the impact of the extra scatter that has
already been introduced into the VLT Δα/α values as a result of the intra-order distortions
reported by Whitmore et al. (2010).

As the distribution of MM transitions is random with respect to the location of the echelle
orders, the effect of these distortions will be random from absorber to absorber. A
distortion of this type, with no monotonic long-range component, constitutes a random effect
(see section 4-4.3). Murphy et al. (2009) applied a model of the distortion found by Griest
et al. (2010) to the 2004 Keck results, and found that the impact on the weighted mean
of the Δα/α values was effectively negligible. It is also worth noting that systems which
utilise a large number of MM transitions will be less sensitive to an effect of this type.
This is because with many transitions, the distortion is sampled in many locations; if the
distortion does not have a long-range component, the average distortion must tend to
zero. It is important to note that because the VLT spectra used here are the result of
the co-addition of many exposures, taken with different echelle grating settings and over
many nights, it is expected that any distortions of the wavelength scale due to light path
differences should be reduced in magnitude. Thus, we consider our estimate of the impact
of this effect to be an upper limit.
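
The averaging argument above can be illustrated with a small Monte Carlo sketch in Python. It is not the model used in this chapter: it assumes, purely for illustration, a sinusoidal intra-order distortion with an amplitude of 150 m/s (about 300 m/s peak to peak) and transitions that fall at uniformly random phases within an echelle order. All names and parameter values are hypothetical.

```python
import math
import random

def mean_distortion(n_transitions: int, amplitude: float, rng: random.Random) -> float:
    """Mean velocity shift (m/s) over transitions placed at random intra-order phases."""
    shifts = [amplitude * math.sin(2.0 * math.pi * rng.random())
              for _ in range(n_transitions)]
    return sum(shifts) / n_transitions

def rms_of_mean(n_transitions: int, amplitude: float = 150.0,
                n_trials: int = 5000, seed: int = 1) -> float:
    """RMS, over many simulated absorbers, of the mean distortion per absorber."""
    rng = random.Random(seed)
    means = [mean_distortion(n_transitions, amplitude, rng) for _ in range(n_trials)]
    return math.sqrt(sum(m * m for m in means) / n_trials)

# The net shift per absorber shrinks roughly as 1/sqrt(N) with the number of transitions.
for n in (2, 4, 8, 16):
    print(n, round(rms_of_mean(n), 1))
```

Because the assumed distortion has zero mean over an order, the residual shift per absorber falls as more transitions are used, which is the behaviour the text relies on.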

To investigate the effect of the Whitmore et al. (2010) distortions on the VLT absorbers, 
we used a model constructed from a Fourier analysis of the velocity shift data presented 
in that paper (kindly provided by F. E. Koch). The iodine cell absorption lines used to 
establish the intra-order distortion results only occur over the wavelength range
∼ 5000–6200 Å. We are therefore forced to assume that our model of these distortions applies
to much bluer and redder wavelengths as well. Clearly, the important part of the model
is the amplitude rather than the period of the distortions; our model has a maximum
peak-to-peak distortion of ∼ 300 m s⁻¹. We show this model in figure 5.11.

Figure 5.11: Δv function used for investigating the wavelength distortions found by
Whitmore et al. (2010), based on a Fourier analysis of their data. This function is repeated to
longer and shorter wavelengths. This function was kindly provided by F. E. Koch.



In table 5.4, we show the result of applying the function shown in figure 5.11 to the VLT
absorbers using equation 5.4. The impact on the location of the dipole and the value of the
monopole is minimal, as expected. However, we note that the σ_rand required is somewhat
larger, which means that this model of the wavelength distortion has introduced extra
scatter into the Δα/α values. Any good model of the systematic should reduce, not
increase, the scatter. The extra scatter reduces the significance of the dipole, but does not
destroy the good alignment between Keck and VLT, nor between low and high redshift
samples. In particular, the chance probability of alignment for the Keck and VLT samples
(where the VLT sample has been altered with this Δv model) is 6 percent, the chance
probability of alignment between low and high redshift samples is 4 percent, and the joint
chance probability for these two factors is 0.3 percent.
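
The link between extra scatter and a larger σ_rand can be sketched in Python as follows. The sketch assumes that σ_rand is the term added in quadrature to every statistical error so that the reduced chi-squared about the model equals one; the procedure actually used in chapter 4 may differ in detail. The residuals and errors below are invented for illustration.

```python
def chi2_nu(residuals, sigmas, sigma_rand, n_params):
    """Reduced chi-squared with sigma_rand added in quadrature to each error."""
    chi2 = sum(r * r / (s * s + sigma_rand * sigma_rand)
               for r, s in zip(residuals, sigmas))
    return chi2 / (len(residuals) - n_params)

def solve_sigma_rand(residuals, sigmas, n_params, hi=100.0, tol=1e-9):
    """Bisection for the sigma_rand giving chi2_nu = 1 (returns 0 if already <= 1)."""
    if chi2_nu(residuals, sigmas, 0.0, n_params) <= 1.0:
        return 0.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_nu(residuals, sigmas, mid, n_params) > 1.0:
            lo = mid   # still too much scatter: sigma_rand must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical residuals about a model and statistical errors (units of 1e-5).
residuals = [1.8, -2.1, 0.4, -1.5, 2.6, -0.3, 1.1, -2.4]
sigmas = [1.0, 1.2, 0.8, 1.1, 1.3, 0.9, 1.0, 1.2]

sigma_rand = solve_sigma_rand(residuals, sigmas, n_params=4)
inflated = solve_sigma_rand([1.3 * r for r in residuals], sigmas, n_params=4)
print(round(sigma_rand, 3), round(inflated, 3))
```

Scaling up the residuals, as a poor distortion model would, forces a larger σ_rand, which in turn weakens the significance of any fitted dipole.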

The presence of intra-order wavelength distortions would serve to increase the scatter of
the Δα/α values about the true values. These distortions can only randomise, not bias,
the Δα/α values; they cannot manufacture a dipole or monopole. Were we able to make
the same quasar observations without the presence of any wavelength scale distortions,
the scatter in the Δα/α values about the model should be smaller (and so σ_rand would be
smaller). We would therefore expect that this would increase the significance of the dipole
model. Our analysis in this section suggests that the maximal reduction in statistical
significance of the dipole which may have occurred as a result of intra-order wavelength
distortions is ∼ 0.6σ.

5-5 UVES, a dual-arm spectrograph

UVES is a dual-arm spectrograph, where the incoming light is split into a red arm and a
blue arm using a dichroic mirror. In principle, misalignment of the slit in the blue arm
relative to that in the red arm would produce a distortion of the wavelength scale between
the two arms, which could mimic a change in α if transitions are fitted simultaneously
from spectral data from both arms. Molaro et al. (2008) investigated the possibility
that such misalignment might cause velocity shifts between the blue and red arms, using
measurements of asteroids, and argued that the two arms do not show separation by more
than 30 m s⁻¹ in the situation where the science exposures are bracketed by the ThAr
exposures. A shift of this magnitude is equivalent to Δα/α ∼ 0.14 × 10⁻⁵ for the Fe II
λ2382 transition, which is negligible in the context of our sample.
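
The conversion from a velocity shift to an equivalent Δα/α can be sketched in Python using the many-multiplet relation ω = ω_0 + q x with x ≈ 2Δα/α, so that Δv/c = −2q(Δα/α)/ω_0 for a single transition. The rest wavenumber and q coefficient below are assumed illustrative values for Fe II λ2382; they are not quoted in this section and should be checked against the atomic data tables used for the fits.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s

def dalpha_from_velocity(dv_m_per_s: float, omega0_cm: float, q_cm: float) -> float:
    """|da/a| mimicked by a velocity shift dv acting on a single transition.

    Uses omega = omega0 + q * x with x ~ 2 * da/a, i.e.
    dv / c = -2 * q * (da/a) / omega0 (sign dropped here).
    """
    return abs(dv_m_per_s) / C_M_PER_S * omega0_cm / (2.0 * abs(q_cm))

# Assumed illustrative atomic data for Fe II 2382 (not taken from this section).
omega0 = 41968.06   # rest wavenumber in cm^-1
q = 1460.0          # sensitivity coefficient in cm^-1

shift = dalpha_from_velocity(30.0, omega0, q)
print(f"{shift:.2e}")
```

With these assumed inputs a 30 m/s shift corresponds to roughly 0.14 × 10⁻⁵, consistent with the value given in the text.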

However, we note that Molaro et al. used a slit width of 0.5″, which is rather different to
the ∼ 0.7″ to ∼ 1.0″ typical of the quasar exposures. The UVES archive indicates that,
for the observations of Molaro et al., the seeing was always poorer than the slit width. If 
the slits for the blue and red arms are misaligned, one would expect the induced effect 
on wavelength calibration to depend on slit size. In the seeing-limited regime, the slit is 
relatively uniformly illuminated, and therefore the observed science wavelengths should be 
well calibrated through the ThAr exposure. On the other hand, when the seeing is much 
better than the slit, one might expect to see larger differences, if such differences exist. 

5-6 The effect of isotopic abundances 

Most of the atomic species we use have a number of stable isotopes, and each of these 
isotopes exhibits a slightly different rest wavelength for a given transition. The isotopic 
spacing depends on the transition and species under consideration, but scales according 
to the inverse square of the mass. That is, Δω_i ∝ ω_0/m_i². The isotopic shifts of Mg,
as the lightest of the species under consideration, are relatively significant. The
terrestrial abundance of the Mg isotopes is ²⁴Mg:²⁵Mg:²⁶Mg = 79:10:11 (Rosman & Taylor,
1998). We define the heavy isotope fraction as Γ = (²⁵Mg + ²⁶Mg)/Mg, which has a
terrestrial value of Γ_t = 0.21.
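
As a quick check of these definitions, the Python sketch below computes the terrestrial heavy isotope fraction from the quoted abundance ratio and compares the inverse-square mass scaling for Mg and Fe. The mass numbers 24 and 56 are used as approximate masses of the dominant isotopes, and only the mass scaling is considered (ω_0 and any field-shift contribution are ignored), so this is an order-of-magnitude illustration.

```python
# Terrestrial Mg isotopic abundances (percent), as quoted in the text.
mg_abundance = {24: 79.0, 25: 10.0, 26: 11.0}

def heavy_fraction(abundance: dict) -> float:
    """Heavy isotope fraction Gamma = (25Mg + 26Mg) / Mg."""
    total = sum(abundance.values())
    return (abundance[25] + abundance[26]) / total

def relative_isotope_scale(mass_a: float, mass_b: float) -> float:
    """Ratio of the 1/m^2 scaling of isotopic spacings for mass_a relative to mass_b."""
    return (mass_b / mass_a) ** 2

gamma_t = heavy_fraction(mg_abundance)
mg_vs_fe = relative_isotope_scale(24.0, 56.0)   # assumed mass numbers for Mg and Fe
print(round(gamma_t, 2), round(mg_vs_fe, 1))
```

The mass scaling alone makes isotopic spacings in Mg several times larger than in Fe, which is why Mg dominates the discussion that follows.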

We have assumed for our final fits that the quasar absorber isotopic abundances are the
same as the terrestrial abundances. However, if the abundances in the absorbers differ
from the terrestrial abundances, this will introduce a small but potentially significant shift
in the quasar absorption lines compared to laboratory measurements. Mg will be most
affected by this, due to its low atomic mass compared to the other species. Previous work
(Murphy et al., 2003a) noted that the effect could be particularly significant for low-z
absorbers, as these predominantly consist of the Fe/Mg combination. High-z absorbers
are less likely to be affected due to the use of more massive anchors (Si and Al), for
which this effect is less relevant. Additionally, the use of many transitions with differing
q coefficients at high redshift will tend to reduce the importance of this effect (Murphy
et al., 2004).

Both observations (Gay & Lambert, 2000) and theoretical estimates (Timmes et al., 1995)
of stellar abundances for Mg suggest that the heavy isotope abundance of Mg (i.e. the ²⁵Mg
and ²⁶Mg isotopes) decreases with decreasing metallicity. Murphy et al. (2003a) noted that
the low-z Mg/Fe systems considered in the Keck sample have relative metal abundances,
[Fe/H], in the range −2.5 to 0.0, whereas the high-z DLA systems have relative metal
abundances of about −1.0. Therefore, the quasar absorbers we consider may also have
sub-solar metallicities. However, observations of some low metallicity red giants show
significant enrichment of the heavy Mg isotopes. Ashenfelter et al. (2004a, b) considered
a "modest" enhancement of the stellar initial mass function (IMF) for intermediate mass
stars (M ∼ 5 M⊙), and showed that this could produce Γ ∼ 0.4 for [Fe/H] ∼ −1.5. Fenner
et al. (2005) argued that such an IMF would substantially overproduce nitrogen relative to
observations, and therefore that this mechanism of creating Γ > Γ_t does not seem possible.

However, it appears that the link between stellar evolution and the likely nitrogen abun- 
dance in quasar absorbers is not fully understood. Centurion et al. (2003) described 
observations of extremely low relative abundances of nitrogen in DLAs, and thus argued 
that nitrogen production cannot be dominated by massive stars. In a detailed study, 
Dessauges-Zavadsky et al. (2007) argued that "no single star formation history explains 
the diverse sets of abundance patterns in DLAs". Melendez & Cohen (2007) claimed (in 
contrast to previous analyses) that heavy Mg isotope enrichment due to AGB stars in the 
Galaxy halo does not occur until [Fe/H] > —1.5. Levshakov et al. (2009) examined 11 
metal-rich, high-redshift (1.5 < z < 2.9) quasar absorbers and argued that the nitrogen 
abundance is uncorrelated with the metallicity, which implies that nitrogen enrichment 
has several sources. They also claimed to observe shifts in the Mg II λλ2796, 2803 lines,
which they ascribed to enrichment of the heavy isotopes relative to terrestrial abundances.

From the arguments above, it appears that the observational situation concerning Γ at high
redshift is uncertain. There are no stringent, independent observations which constrain Γ
in our sample. We therefore treat Γ as unknown and explore what happens if we vary it.

We first consider Γ < Γ_t. To place an upper limit on the effect of Γ < Γ_t, we refit all
the VLT absorbers with no ²⁵Mg or ²⁶Mg, and similarly refit the absorbers in Murphy
et al. (2004) using no ²⁵Mg or ²⁶Mg. We give the parameters for the fits to the Keck, VLT
and combined samples in this situation in table 5.5. The confidence regions on the dipole
location are shown in figure 5.12. Importantly, the dipole model remains statistically
significant at the 3.5σ level. The reduction in significance from 4.1σ is primarily due to
extra scatter introduced into the Δα/α values about the model. The extra scatter implies
that the Γ = 0 model is not a good model for the absorbers. Additionally, the monopole
becomes statistically significant at the 5.7σ level. Thus, a lower heavy isotope abundance
in the quasar absorbers is unable to explain the dipole effect, and additionally increases
the significance of the monopole term. The increase in significance of the monopole term
mirrors the result in Murphy et al. (2003a).
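
The monopole significance quoted above can be checked against the tabulated values with the short Python sketch below. It assumes that the significance is simply |m|/σ_m, which may be a simplification of how it was actually derived; the numbers are the m values from table 5.5 in units of 10⁻⁵.

```python
# Monopole m and its 1-sigma error for each sample (units of 1e-5), from table 5.5.
monopoles = {
    "VLT": (-0.439, 0.197),
    "Keck": (-0.835, 0.156),
    "Keck+VLT": (-0.528, 0.092),
}

def significance(value: float, error: float) -> float:
    """Significance in sigma, assuming it is simply |value| / error."""
    return abs(value) / error

for sample, (m, sigma_m) in monopoles.items():
    print(f"{sample}: {significance(m, sigma_m):.1f} sigma")
```

For the combined sample this simple ratio reproduces the 5.7σ figure given in the text.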

Table 5.5: Effect of removing the ²⁵Mg and ²⁶Mg isotopes on the model Δα/α = A cos(Θ) + m
for the VLT, Keck and combined samples. Generally speaking the effect is to push Δα/α
to more negative values. The column "δA" gives 1σ confidence limits on A. The column
labelled "sig" gives the significance of the dipole+monopole model with respect
to the monopole model. Removing the heavy isotopes also introduces extra scatter into the
Δα/α values about the dipole model, which implies (unsurprisingly) that fits with no heavy
Mg isotopes are not a good representation of the absorbers. Despite the extra scatter, the
dipole model is still significant at the 3.5σ level. Additionally, the monopole term here
becomes significant at the 5.7σ level. σ_rand is given for the different samples; HC refers
to the Keck high-contrast sample.



Sample     σ_rand (10⁻⁵)   A (10⁻⁵)   δA (10⁻⁵)      RA (hr)      dec (°)     m (10⁻⁵)          sig
VLT        1.04            1.20       [0.78, 1.75]   18.1 ± 1.4   −65 ± 14    −0.439 ± 0.197    1.9σ
Keck       1.63 for HC     0.42       [0.32, 0.88]   16.6 ± 2.2   −35 ± 35    −0.835 ± 0.156    0.4σ
Keck+VLT   As above        0.98       [0.77, 1.23]   17.3 ± 1.0   −59 ± 10    −0.528 ± 0.092    3.5σ






Figure 5.12: 1σ confidence regions for the Keck (green), VLT (blue) and combined
(red) dipoles in the circumstance where absorbers containing Mg are fitted with no ²⁵Mg
or ²⁶Mg, to mimic the maximum possible effect of a lower heavy isotope fraction in the
quasar absorbers compared to terrestrial values. Although the confidence regions are
enlarged as a result of extra scatter introduced into the data, the reasonable alignment
between the samples is still maintained. The separation between the Keck and VLT dipole
vectors increases to 32°, which has a chance probability of 11 percent.
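
The separation between two dipole directions is a great-circle angle on the sky. The Python sketch below computes it from the rounded best-fit directions in table 5.5; the function name is illustrative, and because the tabulated values are rounded the result need not match the quoted separation exactly.

```python
import math

def angular_separation_deg(ra1_hr, dec1_deg, ra2_hr, dec2_deg):
    """Great-circle angle in degrees between two directions (RA in hours, dec in degrees)."""
    ra1, ra2 = math.radians(15.0 * ra1_hr), math.radians(15.0 * ra2_hr)
    dec1, dec2 = math.radians(dec1_deg), math.radians(dec2_deg)
    cos_sep = (math.sin(dec1) * math.sin(dec2)
               + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))

vlt = (18.1, -65.0)    # RA (hr), dec (deg) of the VLT dipole, from table 5.5
keck = (16.6, -35.0)   # RA (hr), dec (deg) of the Keck dipole, from table 5.5

separation = angular_separation_deg(*vlt, *keck)
print(round(separation, 1))
```

With the rounded tabulated directions this gives a separation of roughly 33°, in line with the 32° quoted in the caption.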

We now consider the impact of increasing the Mg heavy isotope fraction (Γ > Γ_t). In
section 4-6.8, we discussed the presence of a low-z monopole in both samples, where
the difference between the two samples is remarkably small. Explaining this result via
alterations to the Mg isotope abundance would require enrichment of the heavy isotope
fraction relative to terrestrial values. If we assume that all of the z < 1.6 monopole is
due to relative enrichment of the heavy Mg isotopes, we can extrapolate from the Γ = Γ_t
and Γ = 0 cases to estimate ⟨Γ_{z<1.6}⟩ using a simple linear model. A linear model may be
used as the response of Δα/α to changes in Γ is linear (Murphy et al., 2004). This model
assumes that the ratio of ²⁵Mg/²⁶Mg is fixed. For z < 1.6, m = (−0.390 ± 0.108) × 10⁻⁵
for Γ = Γ_t, and m = (−0.884 ± 0.115) × 10⁻⁵ for the case Γ = 0. Under our linear model,
⟨Γ_{z<1.6}⟩ ∼ 0.32 in order to make m = 0. If we take σ_m = 0.108 × 10⁻⁵ as a representative
error, this yields ⟨Γ_{z<1.6}⟩ = 0.32 ± 0.03.

In summary, variations in the magnesium heavy isotope fraction have the potential to
significantly impact the monopole component of the angular dipole + monopole model,
but cannot explain angular variations in α.

5-7 Summary 

In this chapter, we have explored potential systematic effects to determine whether they
are able to cause the angular variation in α described in chapter 4.

In section 5-2, we used VLT and Keck spectra of 7 quasars to investigate whether
inter-telescope wavelength-dependent systematics exist which could manufacture the dipole
effect. Although we were unable to find a statistically significant common trend from six
spectral pairs, we applied an estimate of the possible wavelength distortion to the VLT
sample, and found that this reduced the statistical significance of the VLT+Keck dipole
from 3.9σ¹ to 3.1σ. Importantly, this does not destroy the good alignment between the
VLT and Keck dipole directions, nor does it significantly affect the alignment between
dipole models fitted to z < 1.6 and z > 1.6 sample cuts. We also investigated the
significant distortion present in the 2206-1958/J220852-194359 spectral pair. We showed that
a distortion of this type cannot apply to the whole sample. From a combined analysis of
all 7 quasars, we conclude that it is unlikely that the combination of these two distortions
in the appropriate proportions is present in the data.

In section 5-4, we examined the potential impact of the intra-order wavelength distortions
found by Whitmore et al. (2010), and concluded that they are unable to explain the
variation in α observed.

In section 5-6 we explored the effect of variations in the Mg heavy isotope fraction, and
showed that these are unable to explain the observed dipole effect, but could explain the
apparent z < 1.6 monopole in both the Keck and VLT data if the quasar absorbers display
an enriched heavy Mg isotope fraction relative to terrestrial values.

We are thus unable to find any systematic effect which can explain the observed angular
variation in α.

We cannot conclusively exclude the possibility that the detected angular variation in α
is the result of some unknown combination of systematic effects. In section 4-6.7 we
showed that the chance probability of getting as good alignment as seen between the
dipole vectors in both low- and high-redshift sample cuts and between the Keck and VLT
samples is 0.1 percent (≈ 3.3σ). Thus, even if it is supposed that the Keck Δα/α
results are systematically shifted to more negative values through some unknown effect,
and the VLT sample shows no statistically significant variation in α, one is still left with
a significant coincidence, or a conspiracy of subtle systematic effects, or some unknown
systematic effect in both telescopes which is significantly correlated with sky position.
Future observations using a different telescope will help to rule out telescope-dependent
systematic effects, although if a systematic effect which knows about declination (rather
than zenith angle) exists which is common to multiple telescopes it will be difficult to
discover. We are unaware of any mechanism which would cause Δα/α to be specifically
correlated with declination in the same way in both telescopes.

¹ Calculated using a reference set.

Further discussion on μ and α



In this chapter we draw the work on μ and α from chapters 3 and 4 together, to discuss
implications which arise from the joint consideration of both sets of results. We also
discuss measurements of other dimensionless ratios.

6-1 μ and α: what have we learned?

At first glance, chapter 3 seems to suggest that Δμ/μ = 0. Certainly the low-z ammonia
results are extremely consistent with no change in μ. However, chapter 4 seems to
reveal significant evidence for spatial variations in α. There are several ways of interpreting
these results together:

1. The μ results are correct, and the α results are instead the result of some unknown
systematic. Assuming that the μ results are correctly described by a weighted mean,
then |Δμ/μ| < 3 × 10⁻⁶ and |Δα/α| < 10⁻⁷. The MM method is relatively resistant
to systematic effects. The thorough investigation into potential systematic effects
by Murphy et al. (2003a) was unable to find any systematic which could explain the
Keck results. Similarly, the investigations in chapter 5 are also unable to eliminate
variation in α. This explanation is possible, but unlikely.

2. The μ results are incorrect, and the α results are correct. It seems unlikely,
although possible, that the ammonia results are incorrect given the good wavelength
calibration in the radio regime. Similarly, the H₂ results are relatively resistant to
systematics on account of the large number of transitions used. An explicit analysis
of potential systematic errors for Q0528-250 shows that they are relatively small.
On the whole, this possibility also seems unlikely.

3. The μ results are correct and the α results are correct. We note immediately that
this possibility implies that |Δμ/μ| < |Δα/α|. This is in conflict with the predictions
given in section 1-4.2 (which predict that |Δμ/μ| ∼ 35|Δα/α| from many types of
theories). Theory, of course, must be guided by the data. We note that it is not
known what the correct model for grand unification is (or even if one exists), and
so theories which predict relationships between Δμ/μ and Δα/α must currently be
characterised as speculative. Thus, any conflict between theories and experiments at
present tends to argue against those theories rather than against the experiments.
If both sets of results are correct, this immediately suggests that we should be
concerned with the possible spatial variation of μ.

6-1.1 Is there spatial variation in μ?

If α varies spatially, then it seems natural to allow for spatial variation of μ as well. It
would seem natural that the spatial variation of α should be tied to the spatial variation of
μ, although this may not be the case. In this circumstance, we can try to look for a dipole
in the data for Δμ/μ, although clearly the small number of results makes this difficult.

In figure 6.1 we show the extragalactic constraints on Δμ/μ from figure 3.17, but instead
plotted against angle from the Δα/α r-dipole model from section 4-7.2. The two ammonia
constraints at z < 1, which are located closer to the Δα/α dipole poles than the equator
(Θ = 90°), seem to suggest that there is no angular variation in μ. However, the dipole
in Δα/α manifests mostly at higher redshifts, and therefore we would naturally expect
ammonia results to show much less variation than the H₂ results.

From figure 6.1 it is immediately clear that the H₂ results are clustered near the Δα/α
dipole equator, and therefore sensitivity to any variation in μ (if it obeys a similar dipole
relationship) should be reduced. Nevertheless, the H₂ data could be consistent with a
dipole having the same direction as the Δα/α dipole. The Q0528-250 points, which are
numerically closest to zero, also lie very close to the equatorial region of the Δα/α dipole,
where no variation would be expected.

To investigate a μ-dipole model explicitly, we apply a dipole model to the H₂ data.
For an angle-only model (Δμ/μ = A cos Θ + m), we obtain: A = (2.8 ± 1.0) × 10⁻⁵,
RA = (20.0 ± 4.6) hr, dec = (−69 ± 8)° and m = (−0.43 ± 0.90) × 10⁻⁵ with χ²_ν = 0.09.
Here, we give the error on A simply as the analytic standard error given the small sample
size. The angular separation between this dipole vector and that from the same model
fitted to the α data is 18°. For a distance-dependent model (Δμ/μ = Br cos Θ + m, where r
is the lookback-time distance to the absorbers), we obtain: B = (2.6 ± 1.0) × 10⁻⁶ GLyr⁻¹,
RA = (18.7 ± 5.7) hr, dec = (−68 ± 5)° and m = (−0.27 ± 0.77) × 10⁻⁵. Clearly the
interpretation of these results is hampered by the small sample size; with a 4-parameter
model, the fit only has a single degree of freedom. Nevertheless, it is certainly intriguing
that the fitted dipole points in a very similar direction to the Δα/α dipole. We show the
results of the fit to Δμ/μ = A cos(Θ) + m in figure 6.2. One can calculate the statistical
significance of the dipole model over the monopole-only model using the method of Cooke
& Lynden-Bell (2010), to obtain ∼ 2σ. However, given the small sample size we believe
that the interpretation of this value is extremely limited.

Figure 6.1: Extragalactic values of Δμ/μ vs angle from the Δα/α dipole. The circles
show the four H₂ results from chapter 3, the square shows the H₂ result from J2123-0050
produced by Malec et al. (2010) and the two triangles show the z < 1 results obtained
from the inversion transitions of NH₃ by Murphy et al. (2008a) and Henkel et al. (2009).
The two Q0528-250 points are displayed slightly offset with respect to each other for
clarity.
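
For a fixed dipole direction, the angle-only model y = A cos Θ + m is linear in A and m, so the two parameters follow from a weighted least-squares solution. The Python sketch below shows that step only; it is a simplification of the fit described above, which also varies the dipole direction (four parameters in total). The data are synthetic and noiseless, generated from hypothetical values of A and m, and are not the H₂ measurements.

```python
def fit_dipole_monopole(cos_theta, values, errors):
    """Weighted least squares for y = A * cos(theta) + m; returns (A, m)."""
    w = [1.0 / (e * e) for e in errors]
    s = sum(w)
    sx = sum(wi * x for wi, x in zip(w, cos_theta))
    sy = sum(wi * y for wi, y in zip(w, values))
    sxx = sum(wi * x * x for wi, x in zip(w, cos_theta))
    sxy = sum(wi * x * y for wi, x, y in zip(w, cos_theta, values))
    det = s * sxx - sx * sx
    amplitude = (s * sxy - sx * sy) / det
    monopole = (sxx * sy - sx * sxy) / det
    return amplitude, monopole

# Synthetic, noiseless example (units of 1e-5); A_TRUE and M_TRUE are hypothetical.
A_TRUE, M_TRUE = 2.8, -0.43
cos_theta = [-0.30, -0.10, 0.05, 0.20, 0.35]   # points clustered near the dipole equator
values = [A_TRUE * x + M_TRUE for x in cos_theta]
errors = [0.6, 0.7, 0.4, 0.4, 0.6]

amplitude, monopole = fit_dipole_monopole(cos_theta, values, errors)
print(round(amplitude, 3), round(monopole, 3))
```

When the points cluster near cos Θ = 0, as the H₂ absorbers do, the lever arm on A is short, which is one reason the fitted amplitude is poorly constrained.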

Including the two ammonia constraints for an r-dipole model yields the results B = (1.5 ±
0.5) × 10⁻⁶ GLyr⁻¹, RA = (23.8 ± 0.3) hr, dec = (−20 ± 4)° and m = (−0.31 ± 0.13) × 10⁻⁵
with χ²_ν = 0.36. The two ammonia constraints place very tight restrictions on the location
of a dipole model of this form, giving the very precise constraints on the dipole location. 
Obviously these values are conditional on the correct specification of the model, which in 
this case is far from certain. As for the H2-only results, the limited sample size impairs 
interpretation of these numbers. Nevertheless, inclusion of the ammonia results destroys 
the good alignment seen between the fitted Δμ/μ dipole and the Δα/α dipole.

The H₂-only results here are suggestive, but far from conclusive. To determine whether
spatial variation exists in μ, a much larger sample of measurements of Δμ/μ will be needed
at high redshifts, preferably at z > 2. If spatial variation in μ does exist, and it occurs in
tandem with variation in α such that the relationship Δμ/μ = R(Δα/α) holds, then the
combination of the H₂ μ data and the α data would suggest that R ∼ 3, which contradicts
the R ∼ 30 to 40 predictions made under various GUT and string-type models. If all these
data are correct, then we have ruled out an apparently quite large range of unification
theories.

Figure 6.2: Δμ/μ dipole model fitted to the H₂ data, where Δμ/μ = A cos(Θ) + m.
The horizontal dotted line shows the value of the monopole. The confidence limits (blue,
dashed lines) are derived from the covariance matrix at the optimisation solution, and
should be regarded as extremely approximate given that the fit only has a single degree of
freedom.

Our results for Q0528-250 yielded a statistical precision for Δμ/μ of ∼ 4 × 10⁻⁶. If a
dipole exists in Δμ/μ, and it has an amplitude of ∼ 3 × 10⁻⁵ at redshifts of 2 < z < 3, then
an H₂ absorber yielding the same precision on Δμ/μ as Q0528-250 near the dipole axis
might detect deviation from Δμ/μ = 0 at the ∼ 8σ level. Although H₂ absorbers are hard
to detect, this line of argument strongly suggests that future searches for H₂ absorbers
should preferentially target DLAs near the Δα/α pole and antipole.
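
The targeting argument can be made concrete with a few lines of Python. The sketch assumes a dipole amplitude in Δμ/μ of about 3 × 10⁻⁵ at these redshifts (roughly B ≈ 2.6 × 10⁻⁶ GLyr⁻¹ over a lookback distance of order 11 GLyr, the latter being an assumed round number) and a per-absorber precision of 4 × 10⁻⁶, and ignores the monopole term.

```python
def expected_significance(amplitude: float, cos_theta: float, precision: float) -> float:
    """Expected |dmu/mu| / sigma for an absorber at angle theta from the dipole axis."""
    return abs(amplitude * cos_theta) / precision

AMPLITUDE = 3e-5   # assumed dipole amplitude in dmu/mu at 2 < z < 3
PRECISION = 4e-6   # statistical precision comparable to the Q0528-250 result

on_axis = expected_significance(AMPLITUDE, 1.0, PRECISION)
near_equator = expected_significance(AMPLITUDE, 0.1, PRECISION)
print(round(on_axis, 1), round(near_equator, 2))
```

Under these assumptions an absorber on the dipole axis would give a deviation of roughly 7 to 8σ, while one near the equator would give less than 1σ, which is why sky position matters so much for future H₂ searches.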



6-1.2 Is spatial variation of a consistent with experimental constraints? 

An immediate concern is whether the results of chapter 4 are consistent with
other experimental constraints on variation of α. For instance, Murphy et al. (2003a)
found for the Keck results that, if one assumes that the rate of change of α is constant
with time, then α̇/α = (6.40 ± 1.35) × 10⁻¹⁶ yr⁻¹, which is in conflict with the atomic clock
constraints of Rosenband et al. (2008) [α̇/α = (−5.3 ± 7.9) × 10⁻¹⁷ yr⁻¹] by an order of
magnitude. The conflict between these two results requires that: i) the Keck results are
wrong, or; ii) the atomic clock results are wrong (which seems extremely unlikely), or; iii)
the variation with time is not linear, or; iv) the Keck results are, at least in part, explained
by spatial variation. We would note that there is no known model which predicts that
the variation in α should be linear with time in the redshift range encompassing both
the quasar-derived results and the present-day constraints, and so the importance of this
apparent conflict is relatively low. The results of chapter 4 also point to spatial variation as
the path to resolving this apparent conflict. Berengut & Flambaum (2010) have compared
the results of that chapter to existing evidence, and found that our Δα/α results are
consistent with all other experiments. We summarise their analysis here briefly.

Atomic clock constraints should be able to detect spatial variation of α given sufficient
precision. Berengut & Flambaum considered the motion of the Earth with respect to the
dipole axis; because Earth's motion is not orthogonal to the dipole axis, atomic clocks
should in principle be able to detect the spatial variation described in chapter 4 as a
slow drift in α as a result of the Solar System's motion with respect to the dipole axis,
with an annual modulation due to the Earth's orbit around the Sun. The drift would be
seen as α̇/α|_lab = 1.35 × 10⁻¹⁸ cos ψ yr⁻¹, where ψ defines the angle of the Solar System's
motion with respect to the dipole axis. From the results in chapter 4, they calculated
that cos ψ = 0.07. They also calculated that the annual modulation will have an amplitude
δα/α ≈ 1.4 × 10⁻²⁰. Given the current best constraint on α̇/α from atomic clocks by
Rosenband et al. (2008) at the 10⁻¹⁷ yr⁻¹ level, this implies that atomic clocks will need
to improve by at least two orders of magnitude to detect a spatial variation of this sort.
Berengut & Flambaum noted the rapid improvement in the precision of atomic clocks,
and suggested that this precision may be achievable.
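
The "two orders of magnitude" statement follows from simple arithmetic, sketched below in Python. The inputs are the drift coefficient and cos ψ attributed to Berengut & Flambaum above, together with an order-of-magnitude clock limit of 10⁻¹⁷ yr⁻¹; the variable names are illustrative.

```python
import math

DRIFT_COEFFICIENT = 1.35e-18   # yr^-1, coefficient multiplying cos(psi)
COS_PSI = 0.07                 # projection of the Solar System motion on the dipole axis
CLOCK_LIMIT = 1e-17            # yr^-1, order of magnitude of the current clock constraint

expected_drift = DRIFT_COEFFICIENT * COS_PSI
improvement = CLOCK_LIMIT / expected_drift

print(f"expected drift ~ {expected_drift:.1e} per year")
print(f"required improvement ~ {improvement:.0f}x "
      f"({math.log10(improvement):.1f} orders of magnitude)")
```

The expected drift is of order 10⁻¹⁹ yr⁻¹, so the clock constraints would need to tighten by a factor of about a hundred before a signal of this size could be seen.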

We noted in section 1-6.4 that the Oklo natural nuclear reactor is sensitive to changes in
α, although changes in X_q dominate the change in the resonance level. One can assume
that ΔX_q/X_q = 0 to obtain the maximal possible constraints on Δα/α for comparison
with our result. Berengut & Flambaum estimated the distance travelled by the Milky Way
since the operation of the Oklo reactor, about 1.8 billion years ago, to be ∼ 3 × 10⁶ light
years. They noted that, based on the results of chapter 4, this implies Δα/α ∼ 10⁻⁹ across
this distance. However, Gould et al. (2006) (for instance) claimed −1.1 × 10⁻⁸ < Δα/α <
2.4 × 10⁻⁸ under the assumption that ΔX_q/X_q = 0, which is not sensitive enough by an
order of magnitude. When X_q is allowed to vary, the constraint obviously worsens, and
thus we can conclude that our Δα/α results are consistent with constraints from Oklo.

Berengut & Flambaum also considered constraints from the β-decay of ¹⁸⁷Re to ¹⁸⁷Os
obtained from meteorites. One can translate measurements of the abundance of these
species into a constraint on the variation of α. This requires the assumption that the
weak coupling constant does not vary, however (Murphy, 2002; Uzan, 2003). The
measurements of the past decay of ¹⁸⁷Re → ¹⁸⁷Os constrain the average decay rate over
the time since the meteorites were formed,

\bar{\lambda} = \frac{1}{t_{\rm now} - t_{\rm f}} \int_{t_{\rm f}}^{t_{\rm now}} \lambda(t)\,\mathrm{d}t,    (6.1)

where t_f denotes the time at which the meteorites formed.

Berengut & Flambaum (2010) concluded from measurements of λ̄ (Smoliar et al., 1996) and
λ_now (Galeazzi et al., 2001) that the current experimental constraints on (λ̄ − λ_now)/λ_now
are at the 10~^ level, whilst the results of chapter 4 imply variation at the ~10~^ level. 
Thus, our results are consistent with the meteorite results. 



6-1.3 Other observational tests for spatial variation in a 

Spatial variation of α should in principle leave an imprint on the CMB; Sigurdson et al.
(2003) consider this explicitly. Not only is the mean power spectrum modified, but spatial
variations in α induce "higher order (non-Gaussian) temperature and polarization
correlations in the CMB" (Sigurdson et al., 2003). Unfortunately, CMB constraints on Δα/α
are currently only at the ∼ percent level. Depending on the mechanism of α variation
and how it scales with distance, the level of precision in CMB measurements required to
confirm spatial variation in α suggested by our distance models in chapter 4 is unclear.



6-1.4 The size of the habitable universe 

Traditionally, asking questions about what might lie beyond the observable universe has
been considered metaphysics, as much of the discussion which follows is inevitably
non-falsifiable. Nonetheless, the 7-year Wilkinson Microwave Anisotropy Probe (WMAP)
results give the spatial curvature, Ω_k, as Ω_k = −0.080^{+0.071}_{−0.093}, where the slight
preference for a closed model results from a degeneracy with the Hubble constant (Larson
et al., 2010). Imposing further constraints on H_0 through local distance scale measurements
and adding in the baryon acoustic oscillation (BAO) data yields Ω_k = −0.0023^{+0.0054}_{−0.0056}
(Komatsu et al., 2010), which is highly consistent with a flat and by implication infinite
universe, unless the universe has non-trivial topology (e.g. dodecahedral, see Luminet et al.
(2003); c.f. Cornish et al. (2004)).

Recall the discussion of the triple-α process from section 1-2.2.1, which suggests that the
fine-structure constant cannot vary by more than a few percent if we are to produce
appreciable quantities of ¹²C or ¹⁶O. Under the extremely strong assumption that something
like local abundances of carbon and oxygen are required for life (or at least, for
carbon-based life), we can ask the question: how big is the habitable universe? If we cannot
observe spatial gradients in the fundamental constants, this question is difficult to answer.
However, observations of a spatial gradient in the fine-structure constant would allow one
to start to speculate on an answer to this question (we agree that this is probably
non-falsifiable, but it is extremely interesting nonetheless). In the presence of a spatial dipole,
one can make the strong assumption that the dipole amplitude grows linearly along the
dipole axis and extrapolate until one is off the triple-α resonance. Obviously, in the
directions orthogonal to the dipole axis there is no constraint via this mechanism, but one
can take the size obtained from the extrapolation as a lower-limit boundary in all
directions. We neglect other effects induced by the variation of α for this simple example.

In chapter 4 we presented evidence for a dipole in α that is larger at larger distances, with
an amplitude of A = 1.1 × 10⁻⁶ GLyr⁻¹ under the assumption that the effect grows linearly
with lookback time. In any event, the quasar data probe most of the size of the observable
universe, so to a first approximation α changes by about 1 part in 10⁵ along the radius of
the observable universe toward the pole of the dipole. If we assume a conservative figure
that changing α by more than one percent makes carbon-based, oxygen-respiring life much
less probable, then under the argument outlined the radius of the habitable universe is
about 1000 times the radius of the observable universe. Taking a sphere of this size as a
lower limit under the argument in the previous paragraph implies that there are about 10⁹
observable universe volumes in the habitable universe, an extremely large number.
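
The arithmetic behind this estimate can be sketched in a few lines of Python. This is only an illustration of the scaling argument, not part of the analysis: the input values are the round numbers used in the text (a fractional change of about 10⁻⁵ across one observable-universe radius, which is what the quoted factor of 1000 implies for a one percent tolerance), and the variable names are our own.

```python
# Back-of-envelope version of the habitable-universe estimate.
# Inputs are the round numbers from the text, not fitted values.

delta_alpha_observable = 1e-5  # assumed fractional change in alpha over one observable-universe radius
delta_alpha_limit = 1e-2       # assumed tolerance: a one percent change in alpha

# Linear extrapolation along the dipole axis: distance at which the tolerance
# is reached, in units of the observable-universe radius.
radius_ratio = delta_alpha_limit / delta_alpha_observable

# The volume of a sphere scales as the cube of its radius.
volume_ratio = radius_ratio ** 3

print(f"habitable radius ~ {radius_ratio:.0f} observable-universe radii")
print(f"habitable volume ~ {volume_ratio:.0e} observable-universe volumes")
```

Because the volume scales as the cube of the radius, any uncertainty in the assumed tolerance on α is amplified threefold in the exponent of the volume estimate.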

Although this estimate is extremely rough, it is a demonstration of how the variation 
of fundamental constants might suggest something to us about the region beyond our 
observable universe, which otherwise is... unobservable. For those who are concerned that
our observable universe does not give enough room for life other than humans to emerge 
by chance (a notion that we do not subscribe to), the extra nine orders of magnitude from 
this calculation might give pause to reconsider the possibility that there is life out there, 
somewhere. 

6-1.5 Implications for physics 

The dipolar variation in α presented in chapter 4, if confirmed, would be a demonstration
of new physics at the most fundamental level. Importantly, it would directly demonstrate 
the incompleteness of the Standard Model, which makes no allowance for spatial variation 
in the fundamental constants. Additionally, it would demonstrate that the Einstein Equiv- 
alence Principle is violated. The combined impact on the Standard Model and General 
Relativity may assist in attempts to unify these two pillars of twentieth century physics; 
this unification of these theories is a problem for which a definitive solution has proved 
elusive over the last few decades. 

Confirmed spatial variation of the fundamental constants would demonstrate the existence 
of a preferred frame in the universe, which has significant implications for cosmology. We 
explore some other claims for cosmological anisotropy in the next section. 

6-2 Other evidence for dipoles & a preferred cosmological direction

The existence of a dipole in α would constitute a preferred direction in the universe.
The natural question to ask is: can such an effect be seen in other data? The r cos θ
dipole described in chapter 4 points in the direction (RA, dec) ~ (17.5 hr, −62°), which
is approximately (l, b) = (330°, −15°) in galactic coordinates. Below we consider other
searches for preferred axes in the literature.

6-2.1 Bulk flows 

There have recently been claims for large-scale bulk motions in the universe. Kashlinsky
et al. (2008) (c.f. Kashlinsky et al., 2009) presented the results of such an analysis. They
used the Sunyaev-Zel'dovich effect (SZ effect) (Sunyaev & Zeldovich, 1980; Birkinshaw,
1999) to measure the line-of-sight peculiar velocity of clusters of galaxies in their own frame
of reference; the kinematic SZ effect is independent of redshift (Kashlinsky et al., 2009).
For single-cluster measurements uncertainties are large, of the order of > 1000 km s⁻¹ per
cluster (Kashlinsky et al., 2009). However, with a sufficient number of clusters and modern
CMB measurements it is in principle possible to determine whether a bulk flow exists.
Kashlinsky et al. analysed ~ 700 X-ray clusters out to redshift z ~ 0.3 and the three-year
WMAP data and found evidence for a bulk flow with amplitude of > 600 km s⁻¹ in
the direction (l, b) = (283°, 11°) ± 14° (Kashlinsky et al., 2008, 2009), or (RA, dec) =
(10.9h, −47°). This is approximately 50° from the dipole found in chapter 4 on the sky.

The statistical significance of this result has been challenged by Keisler (2009), who argued
that the result is due to correlations between the CMB WMAP channels, and that the
statistical significance is more properly characterised as 0.7σ. However, Atrio-Barandela
et al. (2010) considered the error budget for Kashlinsky et al. (2009) in detail and claimed
that the statistical significance is in fact ~ 3 to 3.5σ. They note that the methods used to
compute uncertainty estimates have biases which cause errors to be over-predicted, and 
also that if the bulk flow measurement is indeed caused by a systematic error, it must 
"have a dipole pattern, correlate with X-ray luminosity and be present only at cluster 
positions". 

A less contentious measurement relates to the so-called Great Attractor (Lynden-Bell
et al., 1988); there appears to be motion of the Local Group of galaxies towards a gravitational
source of extremely high mass (M ~ 5.4 × 10¹⁶ M_sun) in the direction of the
Hydra/Centaurus constellations at (l, b) = (307°, 9°). Further study attributed this to a significant
overdensity of clusters in that direction (Raychaudhury, 1989; Scaramella et al., 1989).
More recent study attributes 44% of the motion of the Local Group to the Great Attractor,
with much of the remainder being attributed to the Shapley Supercluster at about
700 Mpc in that direction (Kocevski & Ebeling, 2006).

6-2.2 Supernovae type Ia

Measurements of type Ia supernovae (SN Ia) have revealed that the expansion of the
universe is accelerating (Riess et al., 1998; Perlmutter et al., 1999; Astier et al., 2006).
The current interpretation of this phenomenon is the existence of a positive vacuum energy
with negative pressure, which at present is presumed to be the cosmological constant Λ.
The existence of anisotropy in the SN Ia data would imply anisotropic acceleration of
the universe, yielding a clearly preferred cosmological axis. Cooke & Lynden-Bell (2010)
used the Union compilation of SN Ia data to search for a dipolar anisotropy. They found
a (non-significant) 14% ± 12% increase in the acceleration toward (l, b) = (309°, 43°),
corresponding to (RA, dec) = (13.2h, −20°). This is approximately 60° from the dipole
described in chapter 4. They note that this is 31° from the CMB dipole as seen from the
sun, and only 17° as seen from the CMB frame of elliptical galaxies with v < 2000 km s⁻¹,
where the CMB dipole is in the direction (l, b) = (311°, 26°).

6-2.3 CMB rings 

Kovetz et al. (2010) examined the CMB temperature map and looked for an axis around
which "giant rings" exist, and found such an axis in the direction (l, b) = (276°, −1°) at a
significance of 2.8σ. This corresponds to (RA, dec) = (9.6h, −53°).

6-2.4 Primordial deuterium abundance 

We noted in section 1-2.2.2 that the abundance of ⁷Li is sensitive to variation of the
fundamental constants. In fact, the abundances of all elements are sensitive to variation in
the fundamental constants, with differing degrees of sensitivity. However ⁴He is measured
only at z < 0.01, and ⁷Li only within our galaxy; only the deuterium abundance has been
measured at sufficiently high redshifts that spatial variation in the fundamental constants
might be probed (Berengut et al., 2010b).

Berengut et al. (2010b) investigated the 7 constraints on the high-redshift deuterium
abundance presented in Pettini et al. (2008) to see whether evidence for a dipole can be
found. They conclude that the data do not support a dipole model over a monopole model
on the basis of χ², but note that if one fits a dipole then the direction, RA = (15.5 ± 1.6) hr,
dec = (−14 ± 51)°, is consistent with the results of chapter 4. In galactic coordinates this
is (l, b) = (351°, 34°).

6-2.5 Combined analysis 

Antoniou & Perivolaropoulos (2010) reviewed different results which search for a
cosmologically preferred axis, and selected six different types of observations: SN Ia data (from
the Union 2 set), the CMB dipole, large scale velocity flows (from various techniques), the
anomalous alignment of the CMB dipole, quadrupole and octopole moments¹ and large
scale alignment in quasar optical polarisation data. They gave the mean direction of the
six axes considered as (l, b) = (277° ± 26°, 44° ± 27°), which corresponds to (RA, dec) =
(11.6h, −15°). Under simulations they argue that the probability of obtaining alignment
this good or better by chance is about 0.8%. Excluding the CMB measurements, the
chance probability rises to about 7%.

6-2.6 What does this mean? 

Table 6.1: Summary of some claims for cosmological anisotropy or preferred directions,
given in galactic coordinates (l, b).

Description                                        l (degrees)   b (degrees)
r-dipole from chapter 4                                330           -15
Kashlinsky et al. (2008) bulk flow measurements        283            11
Great Attractor                                        307             9
Supernovae type Ia                                     309            43
Primordial deuterium abundance                         351            34
CMB rings                                              276            -1
CMB dipole (a)                                         264            48
CMB quadrupole (a)                                     240            63
CMB octopole (a)                                       308            63

(a) See references in Antoniou & Perivolaropoulos (2010).



We give a summary of the results described above in table 6.1. From these results, it seems 
reasonable to conclude that the alignment between these various measures of anisotropy 
is suspicious, but far from conclusive. The alignment between these phenomena may be 
due to chance, or there may be some common cause. Another possibility is that common 
systematics exist. Certainly, several of these phenomena rely on CMB measurements, and 
thus common-mode systematics here would be unsurprising, but this fails to explain, for 
instance, the reasonable alignment with the (non-significant) SN Ia dipole. Ultimately,
further investigation is required to determine the importance of these phenomena, and
whether they are related to the α dipole.
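
The angular separations quoted in this section follow from the spherical law of cosines. The following is a minimal sketch, assuming the rounded (l, b) values quoted in the text; the function name and the dictionary of directions are our own illustrative choices, and because the inputs are rounded the outputs agree with the quoted separations only to within a few degrees.

```python
import math

def angular_separation_deg(l1, b1, l2, b2):
    """Great-circle angle (degrees) between two directions given as (l, b) in degrees."""
    l1, b1, l2, b2 = map(math.radians, (l1, b1, l2, b2))
    cos_sep = (math.sin(b1) * math.sin(b2)
               + math.cos(b1) * math.cos(b2) * math.cos(l1 - l2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))

# Rounded galactic coordinates quoted in the text.
alpha_dipole = (330.0, -15.0)
directions = {
    "bulk flow (Kashlinsky et al.)": (283.0, 11.0),
    "Great Attractor": (307.0, 9.0),
    "SN Ia dipole": (309.0, 43.0),
    "deuterium dipole": (351.0, 34.0),
    "CMB rings": (276.0, -1.0),
}

for name, (l, b) in directions.items():
    sep = angular_separation_deg(*alpha_dipole, l, b)
    print(f"{name:32s} {sep:6.1f} deg from the alpha dipole")
```

For two directions drawn at random on the sky the expected separation is 90°, which gives a crude baseline against which to read these numbers.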

6-3 Other dimensionless ratios 
6-3.1 ΔG_N/G_N

Here, G_N is the Newtonian gravitational constant. Measuring G_N is difficult due to the
fact that gravity is weak compared to the other three known forces. Nevertheless, it is
certainly possible to probe ΔG_N/G_N to better precision than our knowledge of G_N. Clearly
G_N is not dimensionless, with [G_N] = kg⁻¹ m³ s⁻². A more appropriate quantity to
investigate is a gravitational fine-structure constant, α_g = G_N m_p²/(ħc) (Moss et al., 2010).
Note that α_g ~ 6 × 10⁻³⁹, emphasising the weakness of gravity relative to
electromagnetism. The results given here are in terms of G_N, which therefore must be interpreted as
assuming constancy of ħ, c and m_p. Note that ħ, c and m_p can be used together to define
a unit system with a unit mass of m_p, a unit length of ħ/(m_p c) and a unit time interval
of ħ/(m_p c²).

¹ This is the so-called "axis of evil" (Land & Magueijo, 2005, 2007).
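
As a quick numerical check of the size of the gravitational fine-structure constant and of the unit system described above, one can evaluate the expressions directly. This is a sketch under stated assumptions: the constants are approximate SI values entered by hand, and we take the Planck constant in the definition to be the reduced constant ħ, which reproduces the leading digit quoted in the text.

```python
# Approximate SI values (entered by hand for illustration).
G_N = 6.674e-11     # Newtonian gravitational constant, m^3 kg^-1 s^-2
m_p = 1.6726e-27    # proton mass, kg
hbar = 1.0546e-34   # reduced Planck constant, J s (assumed to be the "h" of the text)
c = 2.9979e8        # speed of light, m s^-1

# Dimensionless gravitational fine-structure constant.
alpha_g = G_N * m_p**2 / (hbar * c)

# Unit system built from hbar, c and m_p.
unit_mass = m_p                      # kg
unit_length = hbar / (m_p * c)       # m
unit_time = hbar / (m_p * c**2)      # s

print(f"alpha_g     = {alpha_g:.2e}")
print(f"unit length = {unit_length:.2e} m")
print(f"unit time   = {unit_time:.2e} s")
```

Since α_g is dimensionless, a measured change in it has a meaning independent of the unit system, whereas a change in G_N alone does not.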

Big Bang nucleosynthesis yields the constraint that |ΔG_N/G_N| ≲ 0.1 from one second after
the Big Bang (Bambi et al., 2005). Kaspi et al. (1994) used biweekly timing observations
of pulsar B1855+09 over a 7 year period to obtain Ġ_N/G_N = (−9 ± 18) × 10⁻¹² yr⁻¹.
Williams et al. (2004) used the Lunar Laser Ranging experiment over a 30 year baseline
to achieve Ġ_N/G_N = (4 ± 9) × 10⁻¹³ yr⁻¹.

Bambi & Drago (2008) assumed that it may be possible to create a stable strange star from
a neutron star progenitor. The transition from hadronic to quark matter should release
extreme amounts of gamma ray energy in a short timescale. If G_N or Λ_QCD vary, then a
sufficiently large variation would cause some neutron stars to transition to strange stars,
thereby causing a gamma ray burst (GRB). Under the two strong assumptions that strange
or hybrid stars exist (and that not all compact stars are strange or hybrid stars), and that
the transition from hadronic matter to quark matter is a first order transition, coupled
with several auxiliary assumptions, they concluded from the observed rate of long GRBs
that Ġ_N/G_N < 10⁻¹⁷ yr⁻¹.

6-3.2 Combinations of constants 

By comparing transitions with a totally different mechanism of generation, one can
constrain various combinations of fundamental constants. For instance, comparison of
millimetre transitions in CO and optical fine-structure transitions constrains the quantity
F = α²/μ. These dimensionless ratios often include g_p, which is the proton gyromagnetic
ratio. Here we present a selection of constraints on these combinations of fundamental
constants. The precision which can be obtained with these combinations of constants can
be considerable, particularly given the high precision and accuracy of wavelength
measurements in the radio domain. The downside for these measurements is that they require
good absolute wavelength calibration over a potentially very large wavelength range; the
many-multiplet method and the measurement of μ from H₂ transitions require only good
relative wavelength calibrations.

Detection of variation in one of these dimensionless ratios would be extremely interesting,
but the interpretation would require multiple dimensionless ratios in order to relate the
variation directly to variation in μ or α. The most appropriate way of investigating these
ratios would be to fit all of them simultaneously, thereby breaking the degeneracy between
the various fundamental constants. It seems rational to require that the directional
dependence of the different constants should be the same, although other scenarios might be
possible. We leave this task to future work.

6-3.2.1 Δx/x, x = α²g_p/μ

Velocity differences between H I 21cm absorption and optical transitions constrain x =
α²g_p/μ. Srianand et al. (2010) used the recently detected 21cm absorption and metal
line absorption at z ~ 3.174 toward J133724+315254 to derive Δx/x = (−1.7 ± 1.5_statistical ±
0.6_systematic) × 10⁻⁶. However, the ThAr calibration exposures were not taken immediately
after the science exposures, and so there may be additional uncertainty introduced due to
wavelength miscalibration.

Kanekar et al. (2010b) also analysed H I 21cm and C I absorption at z ~ 1.36 and z ~ 1.56
along the lines of sight to Q2237−011 and Q0458−020 respectively. They found that
Δx/x = (+6.8 ± 1.0_stat ± 7.7_max systematic) × 10⁻⁶. One can translate this constraint into
a prediction for Δα/α only if one has information about Δμ/μ and Δg_p/g_p, as Δx/x =
2(Δα/α) + (Δg_p/g_p) − (Δμ/μ). They use the results of King et al. (2008) (section 3-4)
for Δμ/μ, and concluded that the Keck Δα/α results are inconsistent with their findings
and constraints on Δμ/μ unless fractional changes in g_p are larger than those in α and μ.
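
The linear relation between the fractional changes follows from logarithmic differentiation of the definition of x. As a short worked step, valid to first order in the (small) fractional changes:

```latex
\begin{align}
x &= \frac{\alpha^2 g_p}{\mu}
  \quad\Longrightarrow\quad
  \ln x = 2\ln\alpha + \ln g_p - \ln\mu, \\
\frac{\Delta x}{x} &\approx 2\,\frac{\Delta\alpha}{\alpha}
  + \frac{\Delta g_p}{g_p} - \frac{\Delta\mu}{\mu}.
\end{align}
```

The same procedure applied to the other combinations in this section gives each exponent in the product as the coefficient of the corresponding fractional change.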

However, we note that both of these absorbers lie close to the equatorial region of the α
dipole reported here (~ 85° and ~ 115° respectively). To compare our dipole model of α
with the results of Kanekar et al. (2010b), we need a prediction for Δμ/μ in the directions
of Q2237−011 and Q0458−020, along with the unjustified assumption that Δg_p/g_p = 0. We
are reluctant to apply a dipole model for Δμ/μ given the small number of measurements
available, and instead leave this question to be answered when a larger sample of Δμ/μ
results becomes available. Similarly, we are reluctant to calculate a constraint on Δg_p/g_p
under the assumption that Δμ/μ has no angular variation, given that we have detected
apparent variation in α; if α varies across the sky, we cannot assume that μ does not.

6-3.2.2 Δy/y, y = α²g_p

The ratio of 21cm absorption to molecular rotational absorption constrains the quantity
y = α²g_p. Murphy et al. (2001b) analysed the z = 0.2467 and z = 0.6847 absorbers toward
PKS 1413+135 and TXS 0218+357 respectively. They gave Δy/y = (−0.20 ± 0.44) × 10⁻⁵
and (−0.16 ± 0.54) × 10⁻⁵ for the two systems respectively.

6-3.2.3 ΔF/F, F = α²/μ

Levshakov et al. (2010) consider radial velocity differences between galactic sub-mm- and
mm-wave transitions in ¹³CO and the fine-structure transitions in C I toward a variety
of molecular clouds at different galactocentric distances, namely TMC-1, L183, Ceph B,
Orion A/B and Cas A. They used existing radio data to constrain |ΔV| < 110 m s⁻¹, which
leads to |ΔF/F| < 3.7 × 10⁻⁷, with F = α²/μ. However, they noted that their results
derive from statistical measurements which fit single-component Gaussians to profiles which,
in some absorbers, display significant asymmetry. Although they give an M-estimate² as a
robust figure, the relatively low sample size (25 absorbers) means that there is likely to be
residual bias in their estimate, and so their constraint is probably weaker by a moderate
though unknown amount. Although Levshakov et al. did not fit an angular model to their
results, they concluded that there is no spatial variation in F; their definition of spatial
variation is variation from terrestrial values. We note that their results for Orion A/B
have 9 negative values of ΔV and 3 positive values. The chance of obtaining this many
negative values if the true ΔV is zero, based on a binomial estimate, is about 7%. Thus, there may
be residual systematics associated with the clouds, or perhaps galactic variations in F are
very weakly indicated. A more robust method would be to fit an angular variation model
to their results and determine whether angular variations in F exist.
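
The binomial figure can be reproduced directly. A minimal sketch, assuming that each of the 12 Orion A/B velocity offsets is independently negative with probability one half when there is no true offset, and that the quoted probability is the one-sided tail (9 or more negative values):

```python
from math import comb

n_values = 12      # 9 negative + 3 positive velocity offsets
n_negative = 9

# One-sided tail: probability of at least n_negative negative values
# under a fair-coin null hypothesis.
p_tail = sum(comb(n_values, k) for k in range(n_negative, n_values + 1)) / 2**n_values

print(f"P(>= {n_negative} negative out of {n_values}) = {p_tail:.3f}")
```

A two-sided version of the same test (asking for an imbalance this large in either direction) would double this probability, which weakens the indication further.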



6-3.2.4 ΔJ/J, J = g_p(α²μ)^1.57

Kanekar et al. (2005)³ compared the OH 18cm and H I 21cm lines from the z ~ 0.765
gravitational lens toward PMN J0134−0931, which constrain variation in J = g_p(α²μ)^1.57.
They report that ΔJ/J = (0.44 ± 0.36_stat ± 1.0_sys) × 10⁻⁵, which is consistent with no
change in J. We note that this absorber is ~ 95° from the pole of our dipole, and therefore
minimal variation in α would be expected from our results.



6-3.2.5 ΔG/G, G = g_p(α²μ)^1.85

Kanekar et al. (2010a) have reported an observation of the satellite OH 18cm lines at
z ~ 0.247 toward PKS 1413+135 (lookback time ~ 2.9 Gyr). By combining results from
the Westerbork Synthesis Radio Telescope and the Arecibo Telescope, they found that
ΔG/G = (−1.18 ± 0.46) × 10⁻⁵, a 2.6σ detection. They noted that the conjugate nature
of the absorption and emission lines provides a check on systematics, and by looking at
the difference between the optical depth in absorption and emission found no evidence for
systematic effects. We note that this absorber lies at ~ 85° from our dipole pole, and
therefore should be expected to show minimal variation in α if our results are correct.

² See section 4-4.8.2 for the definition of an M-estimate.

³ Note that in their paper the quantity J is labelled F. We have renamed it to avoid conflict with F
defined above.

6-4 Future avenues of research
6-4.1 ²²⁹Th nucleus optical transition

The ²²⁹Th nucleus has the lowest known excited state of any nuclear transition, a
meagre 7.6 ± 0.5 eV above the ground state (Beck et al., 2007). The transition has not
been measured directly, but instead determined from differences of many γ-transitions to
the ground level and first excited state. The width of the level is estimated at 10~^ Hz 
(Tkalya et al., 2000), explaining the lack of a direct detection. As the transition is extremely
narrow, it can in principle be used as the standard for a high accuracy clock (Flambaum
& Berengut, 2009). The transition is in the UV spectrum, and therefore can in principle be
excited with conventional lasers, although the experimental difficulties in exciting a
nuclear transition are considerable.

This transition appears to be extremely sensitive to a change in fundamental constants. 
A rough estimate by Flambaum & Berengut (2009) gives 



which implies that Aoj ~ 3 X 10^0 X (AXq/Xg) Hz. With a width of ~ 10"^ Hz, this implies 
that one could achieve sensitivities to AXg/Xq of about 1 part in lO^'* per year, which 
is about ten orders of magnitude better than the current constraints on the variation of 
X_q (Flambaum & Berengut, 2009). If such a clock could be built, the precision would be several
orders of magnitude better than needed to detect the spatial variation of α implied by the
results of chapter 4; verification or refutation of these results would be extremely rapid.

Rellergert et al. (2009) noted that, as the nucleus is well isolated from the general environ- 
ment, a thorium nuclear clock might be constructed in the solid state (crystal) environ- 
ment. Based on an analysis of the crystal environment, they conclude that one second of 
photon collection may yield a (systematic-limited) accuracy of A/// ~ 2 X 10"^^, which 
is comparable with the precision available from present atomic clock experiments over the 
course of a year. 

Markov Chain Monte Carlo
methods applied to Δα/α

In this chapter, we set out to verify whether the Voigt profile fitting program VPFIT
produces correct parameter estimates and uncertainties for particular models. We briefly
present the theory behind VPFIT in order to demonstrate both how parameter values
are estimated, and how uncertainties are derived. This also serves to demonstrate the
multiple potential points of failure for an optimisation algorithm of this type. We then
demonstrate the application of Markov Chain Monte Carlo (MCMC) methods to show
that, in the context of simple Δα/α fits, the estimates of Δα/α produced by VPFIT are
good, and also that the associated uncertainties on Δα/α are reasonable.

7-1 Introduction 
7-1.1 Motivation 

Chand et al. (2004) analysed 23 absorbers using VLT/UVES data, and reported Δα/α =
(−0.06 ± 0.06) × 10⁻⁵, which appears to contradict the results of Murphy et al. (2004),
with 143 absorbers. Although the VLT/UVES data are generally of higher signal-to-noise
than the Keck data used by Murphy et al. (2004), the statistical precision reported by
Chand et al. (2004) seems to be too good when considering the differences in sample size.

Chand et al. (2004) modelled Δα/α as an external parameter to each fit, rather than
including it as a free parameter in each fit as we have done. In this approach, one steps
through values of Δα/α and determines the value which minimises χ². A significant
disadvantage of this method is reduced speed, as one is not using gradient and curvature
information of χ² with respect to Δα/α at a given point to locate the minimum.
Nevertheless, a plot of χ² vs Δα/α is instructive. Sufficiently near the χ² minimum, the
functional form of χ² implies that a plot of χ² vs Δα/α should be approximately parabolic.
The uncertainty on Δα/α can be determined by solving

\[ \chi^2\left(\Delta\alpha/\alpha_{\min} + \sigma_{\Delta\alpha/\alpha}\right) - \chi^2\left(\Delta\alpha/\alpha_{\min}\right) = 1 \tag{7.1} \]

for σ_{Δα/α} (Press et al., 1992), where Δα/α_min is the value of Δα/α which gives the
minimum χ². In their paper, Chand et al. (2004) show plots of χ² vs Δα/α, which
demonstrate fluctuations near the purported minimum that are much larger than unity.
This implies not only that the minimisation algorithm is unlikely to have reached the
true minimum, but also that the uncertainty on Δα/α has not been correctly determined.
Murphy et al. (2007c) and Murphy et al. (2008b) considered these issues in more detail.
Additionally, Murphy et al. (2008b) demonstrated that the statistical precisions quoted
by Chand et al. (2004) exceed the theoretical maximum allowed by the spectral data and
associated errors. This suggests that the results of Chand et al. (2004) are unreliable.
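
To make the Δχ² = 1 criterion concrete, the following toy sketch steps a single external parameter through a grid, locates the χ² minimum, and reads off the 1σ uncertainty from the point at which χ² has risen by one. It is an illustration only: the "data" are simulated Gaussian values with a known per-point uncertainty, the parameter is simply their mean rather than Δα/α, and the grid spacing and seed are arbitrary choices of ours. For this problem the answer is known analytically (σ/√N), which provides a check.

```python
import math
import random

random.seed(1)
sigma = 0.5                                   # known per-point uncertainty (toy value)
data = [random.gauss(2.0, sigma) for _ in range(50)]

def chi2(mu):
    """chi^2 of a constant model with value mu."""
    return sum(((y - mu) / sigma) ** 2 for y in data)

# Step through trial values of the parameter, as in the external-parameter approach.
grid = [1.5 + 0.0005 * i for i in range(2001)]          # 1.5 ... 2.5
values = [chi2(mu) for mu in grid]
i_min = min(range(len(grid)), key=values.__getitem__)
mu_min, chi2_min = grid[i_min], values[i_min]

# Walk upward from the minimum until chi^2 has risen by 1.
i = i_min
while values[i] - chi2_min < 1.0:
    i += 1
sigma_mu = grid[i] - mu_min

expected = sigma / math.sqrt(len(data))                 # analytic 1-sigma error on the mean
print(f"grid estimate: {sigma_mu:.4f}, analytic: {expected:.4f}")
```

If the χ² curve fluctuated by more than unity from one grid point to the next, the crossing point found by the loop would be set by those fluctuations rather than by the curvature of the parabola, which is the concern raised in the text.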

Murphy (2002) demonstrated that, over an ensemble of simulated spectra, VPFIT recovers
the input value of Δα/α on average, and that the mean 1σ uncertainty matches that
expected from the introduced noise. This strongly suggests that VPFIT is working correctly.
However, this does not demonstrate that for a particular spectrum VPFIT produces
good parameter estimates and uncertainties. Although the results of Murphy (2002) seem
robust, the results of Chand et al. (2004) motivate us to attempt a direct demonstration
that VPFIT is working as intended. Markov Chain Monte Carlo methods, described below,
allow direct exploration of the likelihood function and parameter space considered in a
reasonable amount of time, therefore allowing us to verify whether or not the output of
VPFIT is good.

7-1.2 Optimisation theory 

When fitting a model to data, χ² minimisation techniques are widely used, which minimise
the quantity

\[ \chi^2 = \sum_{i=1}^{N} \frac{[f(\mathbf{x})_i - y_i]^2}{\sigma_i^2} \tag{7.2} \]

where f(x)_i is the prediction of the model at the ith data point, y_i is the observed value of
the ith point, σ_i is the associated statistical uncertainty of that data point and the sum is
over N points. The χ² statistic is a sufficient quantity to determine maximum likelihood
parameter values and uncertainties in the case of Gaussian likelihood. A wide variety
of methods are available to undertake this minimisation process. For linear functions,
explicit solutions exist, but otherwise an iterative method must be applied from some
starting guess at the parameters x.

The most common methods utilised are Newton-type methods, which utilise a local
parabolic approximation to χ² and then search towards the projected minimum by some
prescription. Sufficiently near a minimum, we expect that χ² is reasonably well
approximated by a parabola. Using the Taylor series expansion at some point x yields

\[ \chi^2(\mathbf{x} + \mathbf{p}) \approx \chi^2(\mathbf{x}) + \mathbf{g}^T(\mathbf{x})\,\mathbf{p} + \frac{1}{2}\,\mathbf{p}^T \mathbf{G}(\mathbf{x})\,\mathbf{p} \tag{7.3} \]

where g is the vector of first order partial derivatives of χ² with respect to the parameters
(the gradient vector) and G is the matrix of second order partial derivatives (the Hessian
matrix). If the parabolic approximation is exact, one can find the exact minimum by
finding p which minimises

\[ \chi^2(\mathbf{x} + \mathbf{p}) - \chi^2(\mathbf{x}) = \mathbf{g}^T(\mathbf{x})\,\mathbf{p} + \frac{1}{2}\,\mathbf{p}^T \mathbf{G}(\mathbf{x})\,\mathbf{p}. \tag{7.4} \]

The stationary point (minimum) of this function can be obtained by solving the set of
linear equations

\[ \mathbf{G}(\mathbf{x})\,\mathbf{p} = -\mathbf{g}(\mathbf{x}). \tag{7.5} \]

In practice, the parabolic approximation is unlikely to be exact, and thus one can search
along the direction p for a lower value of χ² (the Gauss-Newton method) (Gill et al.,
1986). Another possibility is to heuristically modify the Hessian matrix, G, to account for
the imperfection of the approximation by scaling the diagonal of the Hessian by a factor
(1 + λ) for different trial values of λ. The λ = 0 case gives the case of equation 7.5, whilst
the limit λ → ∞ gives the steepest descent method, which simply searches down the local
gradient. As λ → ∞, the implied step size tends to zero (Gill et al., 1986). As the search
vector tends towards the local gradient descent direction, and the step size tends to zero,
a lower point will always be found unless x is the solution or λ is poorly chosen. This
method is the Levenberg-Marquardt method (Marquardt, 1964; Press et al., 1992). One
then iterates the chosen method until one cannot find a significantly lower point. For
VPFIT, this leads to a stopping criterion that the fractional change in χ² between iterations
must be smaller than some user-specified tolerance. We have used ^Xtoi ~ 10~^- 

Previously, VPFIT implemented the Gauss-Newton method, which in most circumstances
works very well. However, we found that for the complicated molecular hydrogen fits,
with thousands of free parameters, full convergence did not seem to occur. Investigations
showed this to be due to a combination of the large number of parameters and the presence
of many parameters that were only moderately well or poorly determined. We have
modified VPFIT to run a dual optimisation process to overcome this. At each iteration,
VPFIT attempts to take both a Gauss-Newton step and a Levenberg-Marquardt step, and
takes whichever of the steps produces a greater reduction in χ². We have found this
algorithm to be very successful in producing apparent convergence even for thousands of
free parameters.
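
The dual-step idea can be illustrated on a toy problem. The sketch below is not the VPFIT implementation: it fits a two-parameter exponential to a handful of made-up points, uses analytic model derivatives and the Gauss-Newton form of the Hessian, takes a full step rather than a line search, and adjusts λ by factors of ten, all of which are simplifying assumptions of ours. At each iteration it evaluates both an undamped (λ = 0) step and a damped step with the Hessian diagonal scaled by (1 + λ), and keeps whichever gives the lower χ².

```python
import math

# Toy data (hypothetical): y = a * exp(-b * t) plus fixed "noise", known sigma.
t = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
noise = [0.05, -0.08, 0.10, -0.03, 0.06, -0.09, 0.02]
y = [3.0 * math.exp(-1.2 * ti) + ni for ti, ni in zip(t, noise)]
sigma = 0.1

def chi2(p):
    a, b = p
    try:
        return sum(((a * math.exp(-b * ti) - yi) / sigma) ** 2 for ti, yi in zip(t, y))
    except OverflowError:              # a wild trial step: treat as a very bad fit
        return math.inf

def gradient_and_hessian(p):
    """Gradient g and Gauss-Newton Hessian G of chi^2 at p."""
    a, b = p
    g = [0.0, 0.0]
    G = [[0.0, 0.0], [0.0, 0.0]]
    for ti, yi in zip(t, y):
        e = math.exp(-b * ti)
        r = a * e - yi                 # residual f_i - y_i
        d = (e, -a * ti * e)           # (df/da, df/db)
        for j in range(2):
            g[j] += 2.0 * r * d[j] / sigma**2
            for k in range(2):
                G[j][k] += 2.0 * d[j] * d[k] / sigma**2
    return g, G

def trial_step(p, lam):
    """Solve G' dp = -g, where G' is G with its diagonal scaled by (1 + lam)."""
    g, G = gradient_and_hessian(p)
    A, B, C, D = G[0][0] * (1 + lam), G[0][1], G[1][0], G[1][1] * (1 + lam)
    det = A * D - B * C
    dp = ((-g[0] * D + g[1] * B) / det, (-g[1] * A + g[0] * C) / det)
    return (p[0] + dp[0], p[1] + dp[1])

def fit(p, tol=1e-8, max_iter=100):
    lam, current = 1e-3, chi2(p)
    for _ in range(max_iter):
        candidates = [trial_step(p, 0.0), trial_step(p, lam)]   # undamped, damped
        best = min(candidates, key=chi2)
        new = chi2(best)
        if new <= current:
            frac = (current - new) / current    # fractional change in chi^2
            p, current, lam = best, new, lam / 10.0
            if frac < tol:
                break
        else:
            lam *= 10.0                # neither step helped: damp harder
            if lam > 1e12:
                break
    return p, current

start = (2.0, 0.8)
best_fit, chi2_min = fit(start)
print(best_fit, chi2_min)
```

The stopping rule mirrors the one described in the text, a fractional change in χ² below a tolerance, although the tolerance used here is an arbitrary illustrative value.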

To implement either of the methods above, one requires knowledge of the first and second
order partial derivatives of χ² with respect to all the parameters. For the estimation of
parameters from quasar spectra, the model consists of a series of Voigt profiles.
Unfortunately, not only is the Voigt function non-analytic, but some of the derivatives are also
non-analytic. As a result, the Voigt function is evaluated through numerical approximations,
whilst derivatives are calculated through the use of finite differencing methods on
function values. The first order partial derivatives with respect to a parameter x_j are
given by

\[ \frac{\partial\chi^2}{\partial x_j} = 2\sum_{i=1}^{N} \frac{[f(\mathbf{x})_i - y_i]}{\sigma_i^2}\,\frac{\partial f(\mathbf{x})_i}{\partial x_j}, \tag{7.6} \]

whilst the second order partial derivatives are given by

\[ \frac{\partial^2\chi^2}{\partial x_j\,\partial x_k} = 2\sum_{i=1}^{N} \frac{1}{\sigma_i^2}\left[\frac{\partial f(\mathbf{x})_i}{\partial x_j}\,\frac{\partial f(\mathbf{x})_i}{\partial x_k} + [f(\mathbf{x})_i - y_i]\,\frac{\partial^2 f(\mathbf{x})_i}{\partial x_j\,\partial x_k}\right]. \tag{7.7} \]



For even moderate numbers of fitted components, the computational effort required to
calculate the second order partial derivatives becomes severe. However, the second term
in equation 7.7 contains the term $[f(\mathbf{x})_i - y_i]$. For a well-fitting model, with large numbers
of degrees of freedom, we expect that this term has zero expectation value, as the model
should predict $f(\mathbf{x})_i > y_i$ about as much as $f(\mathbf{x})_i < y_i$. Thus, when summed over a large
number of data points, the second order partial derivatives of $\chi^2$ should be dominated by
the first term in equation 7.7. Approximating the second order partial derivatives with
only the first term is known as the Gauss-Newton approximation, giving

$$\frac{\partial^2 \chi^2}{\partial x_j \partial x_k} \approx 2 \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \frac{\partial f(\mathbf{x})_i}{\partial x_j} \frac{\partial f(\mathbf{x})_i}{\partial x_k}. \qquad (7.8)$$

This approximation is much faster, as the second order partial derivatives can be approximated
through a combination of first order partials. This approximation tends to be good
in practice, and is used by VPFIT. Note that the Gauss-Newton approximation ensures
that the Hessian matrix is positive semi-definite, and positive definite whenever the first
order partials are linearly independent. This is useful: if the Hessian matrix is
positive definite, then the Gauss-Newton direction is guaranteed to be a descent direction
(in the absence of numerical issues) (Press et al., 1992). However, numerical issues may
serve to push p sufficiently far away from the true Gauss-Newton direction that it is no
longer a descent vector (Gill et al., 1986). Alternatively, inadequate searching along the
direction p may mean that a lower point is not found even when one exists.

Once a purported solution is reached,

$$C = G^{-1} \qquad (7.9)$$

gives the covariance matrix (Press et al., 1992). The square roots of the diagonal terms of
the covariance matrix correspond to the estimated uncertainty on the various parameters
(Fisher, 1958).
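
As a rough illustration of how these quantities fit together, the following Python sketch builds the gradient and Gauss-Newton Hessian of $\chi^2$ from a forward-difference Jacobian and then inverts to obtain a covariance matrix. The callable `model`, the step sizes `h` and the function name are hypothetical. The sketch takes the covariance to be the inverse of half the $\chi^2$ Hessian (the convention of Press et al.); how that maps onto the normalisation of $G$ used in this chapter is an assumption.

```python
import numpy as np

def gauss_newton_pieces(model, x, y, sigma, h):
    """Gradient and Gauss-Newton Hessian of chi^2 via forward differences.

    model(x) is a hypothetical callable returning the model values f(x)_i;
    h holds one finite-difference step size per parameter (assumed values).
    """
    f0 = model(x)
    # Forward-difference Jacobian J_ij = d f_i / d x_j.
    J = np.empty((f0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += h[j]
        J[:, j] = (model(xp) - f0) / h[j]
    w = 1.0 / sigma**2
    grad = 2.0 * J.T @ (w * (f0 - y))      # first derivatives of chi^2
    hess = 2.0 * (J * w[:, None]).T @ J    # first (Gauss-Newton) term only
    # Assumed convention: covariance = inverse of half the chi^2 Hessian.
    cov = np.linalg.inv(0.5 * hess)
    return grad, hess, cov
```

The parameter uncertainties would then be `np.sqrt(np.diag(cov))`. Only first derivatives of the model are ever evaluated, which is the source of the speed advantage described above.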

Note that, depending on the spectra being fitted and the number of modelled components,
there may be many local minima of $\chi^2$ which do not represent ideal fits to the data,
or for which the fit is reasonable but the values of some parameters are unlikely to be
physically plausible. This problem increases with increasing model complexity. Although
in the future automated methods may supplant human interaction, at present good results
are generally achieved faster through user-supplied starting guesses.

7-1.3 What can go wrong? 

Although the minimisation theory given above seems clear cut, in practice the implementation
may be severely affected by certain issues. These include: general programming
errors (bugs); errors in the calculation of the partial derivatives; and ill-conditioning due
to model mis-specification.

7-1.3.1 Programming errors 

Although it is never a topic one likes to consider, the possibility remains that programming
errors may cause the failure of an optimisation algorithm to converge. To the extent that
such errors existed in VPFIT, our MCMC algorithm would detect them (see in particular
section 7-2.4). However, our MCMC code has been created as an add-on to VPFIT, and
therefore to the extent that the two use the same Voigt profile generation code and other
housekeeping routines, such errors will be common to our analysis. Thus, our MCMC code
allows us to demonstrate whether or not failure of the optimisation algorithm has occurred,
but does not allow us to confirm the absolute correctness of the solution. Nevertheless,
Murphy (2002) generated spectra using independent Voigt profile code and found that
VPFIT recovered the input values, suggesting that much of the "back end" of VPFIT works
correctly and that our Voigt profile generator is reliable.

7-1.3.2 Partial derivatives of $\chi^2$

A key point of consideration is the calculation of the partial derivatives of $\chi^2$ with respect to
various parameters. Consider a quasar which presents an unabsorbed continuum intensity
$I_0(\nu)$. If an absorbing cloud exists along the line of sight to Earth, the observed intensity,
$I(\nu)$, is given by the convolution of the intrinsic spectrum and the instrumental profile
(IP) of the observing instrument as

$$I(\nu) = \Phi(\delta\nu) \otimes I_0(\nu)\, e^{-\tau(\nu)},$$

where $\Phi(\delta\nu)$ is the instrumental profile and $\tau(\nu)$ is the optical depth of absorption due to
the intervening cloud. $\chi^2$ is determined by the observed profile, and therefore the partial
derivatives of $\chi^2$ with respect to the parameters must account not only for the non-analytic
nature of the Voigt function but also the instrumental profile. At present, the best way to
obtain the partial derivatives of $\chi^2$ with respect to the various parameters is using finite
differencing methods.

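With finite differencing methods, one must choose parameter step sizes $h_j$ to evaluate the
approximation

$$\frac{\partial \chi^2}{\partial x_j} \approx \frac{\chi^2(x_1, \ldots, x_j + h_j, \ldots) - \chi^2(x_1, \ldots, x_j, \ldots)}{h_j}.$$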

If $h_j$ is too large, then the approximation will be poor because the neglected higher-order
terms become significant. If
$h_j$ is too small, the use of finite precision calculations will result in substantial amounts of
cancellation, which will also render the approximation inaccurate. Poor choices of $h_j$ will
result not only in poor convergence but also in incorrect uncertainty estimates through
propagation into the covariance matrix. If the accuracy to which $\chi^2$ is computed is $\epsilon_f$, it
can be shown that the optimal choice of $h_j$ is

$$h_j \sim \sqrt{\epsilon_f}\, x_c,$$

where $x_c \equiv [\chi^2 / (\partial^2 \chi^2 / \partial x_j^2)]^{1/2}$ is the curvature scale of the function (Press et al., 1992). To apply this
formula, one would need knowledge of the derivatives of the convolved function at each
pixel, which depends on the column density $N$, the dispersion parameter $b$, the distance
from the line centre and the instrumental resolution. Rather than attempting to thoroughly
investigate this 4D parameter space, it turns out that fixed parameter step sizes
work reasonably well in most cases, provided that they are adequately chosen. VPFIT uses
fixed parameter step sizes which have been chosen through experimentation to yield good
results, in that convergence seems to be reached and the solution does not seem to be
unduly affected by reasonable perturbations to the starting guesses for parameters. In
particular, we use $h_z$ = 10~^, $h_{\log_{10} N} = 0.005$ (where $N$ is in atoms per cm$^2$) and $h_b = 0.1$
km/s. Our experience indicated in particular that values of $h_b$ much smaller than this
seemed to affect convergence. For $\Delta\alpha/\alpha$ we choose $h_{\Delta\alpha/\alpha}$ = 10~^, which is always smaller
than the statistical uncertainties we generate. It is extremely difficult to optimise an
arbitrary function in general, and knowledge of the likely scale of the solution often yields
insights into how to construct a successful optimiser; the various values of $h$ above relate
to the typical scale of the uncertainties on parameters in our Voigt profile models.
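
The trade-off between truncation error and cancellation can be seen in a small Python experiment. It uses a one-sided difference on the exponential function, which stands in for $\chi^2$ purely for illustration; the function names and the list of trial step sizes are assumptions, not anything taken from VPFIT.

```python
import numpy as np

def forward_difference(f, x, h):
    """One-sided finite-difference estimate of df/dx with step h."""
    return (f(x + h) - f(x)) / h

def step_size_errors(f, dfdx, x, steps):
    """Absolute error of the forward difference for each trial step size."""
    return np.array([abs(forward_difference(f, x, h) - dfdx(x)) for h in steps])

# exp is its own derivative, so the exact answer is known.
steps = np.array([1e-1, 1e-4, 1e-8, 1e-12, 1e-15])
errors = step_size_errors(np.exp, np.exp, 1.0, steps)
for h, err in zip(steps, errors):
    print(f"h = {h:.0e}   |error| = {err:.2e}")
```

In double precision the error is smallest for intermediate steps and grows again at both extremes, which is the behaviour that motivates choosing each $h_j$ with the scale of the corresponding parameter in mind.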

7-1.3.3 Ill-conditioning 

Another possibility is that the model is overspecified ("over-fitting"). For an over-fitted
model, the data will not discriminate adequately between some of the parameters, leading
to $\chi^2$ space being relatively flat in relation to these parameters. This implies not only
that strong covariances are likely between these parameters, but also that the parameter
uncertainties will be large. This leads to two effects. Firstly, uncertainty estimates on
some parameters may be larger than they need otherwise be (Gill et al., 1986). Secondly, and perhaps
more importantly, this can seriously affect the convergence of the optimisation algorithm.
This problem presents as ill-conditioning of the Hessian matrix (and therefore of equation
7.5). The inverse of the Hessian matrix, $G^{-1}$, can be written as $G^{-1} = V \cdot [\mathrm{diag}(1/w_j)] \cdot U^T$,
where $U$ and $V$ are orthogonal square matrices (Press et al., 1992); this is the singular
value decomposition (SVD) of $G^{-1}$. The condition number of the Hessian matrix is
defined as the ratio of the largest to smallest $w_j$. As the condition number increases, small
perturbations in the inputs to equation 7.5 will lead to large variations in the solution. This
tends to render the optimisation algorithm unstable. A useful heuristic is that the number
of significant figures lost is equivalent to the base-10 logarithm of the condition number
(Press et al., 1992); for double precision (with approximately 16 significant figures), a
condition number of $\sim 10^{16}$ implies that no digits in the solution of equation 7.5 are
correct. We observe that even for moderately simple problems, condition numbers of 
10^ to 10^ are common. For substantially overfitted problems, condition numbers can 
reach 10^^ or higher. Ill-conditioning can push the search direction for the Gauss-Newton 
method arbitrarily far from the true solution (Gill et al., 1986). This can lead to an
inability to find a lower point along the search direction, causing premature termination of the algorithm.
The Levenberg-Marquardt algorithm is less susceptible to this problem: in the event
of ill-conditioning, scaling the diagonal of the Hessian reduces the condition number by
forcing the matrix to be diagonally dominant.
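
A toy Python example may help to make the effect of diagonal scaling concrete. The Jacobian below is invented for illustration (two nearly degenerate columns mimic an over-fitted model), and the damping value `lam` is an assumption; the sketch only shows how the condition number responds, not how VPFIT chooses its damping.

```python
import numpy as np

def condition_number(G):
    """Ratio of the largest to smallest singular value of G."""
    w = np.linalg.svd(G, compute_uv=False)
    return w.max() / w.min()

def damp_diagonal(G, lam):
    """Levenberg-Marquardt style damping: scale the diagonal by (1 + lam)."""
    return G + lam * np.diag(np.diag(G))

# Toy Jacobian with two nearly degenerate columns, mimicking an over-fitted model.
t = np.linspace(0.0, 1.0, 50)
J = np.column_stack([t, t + 1e-4 * t**2, np.ones_like(t)])
G = J.T @ J

cond_raw = condition_number(G)
cond_damped = condition_number(damp_diagonal(G, 1e-3))
print(f"condition number, undamped: {cond_raw:.2e}")
print(f"condition number, damped:   {cond_damped:.2e}")
```

Even a small amount of damping lifts the smallest singular value substantially, at the cost of shortening the step taken in the poorly constrained direction.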

Note that, provided that G remains positive definite, an optimisation algorithm should 
continue to head downhill until a solution is reached, even if substantial alterations to G 
are made (Press et al., 1992). This implies that parameter uncertainties are much more 
likely to be affected by numerical problems than the parameter estimates themselves, 
although numerical instabilities of sufficient magnitude will also prevent convergence. 

7-1.4 Verification of the solution 

To address all the concerns above, one would like an independent method of verifying
not only that the purported solution is a local minimum of $\chi^2$, but also that parameter
uncertainties are realistic. In principle, one could explicitly map out $\chi^2$ with respect to all
the model parameters, either exhaustively or through traditional Monte Carlo methods, to
verify both the location of the minimum and the curvature. Unfortunately, this problem
becomes exponentially difficult with the number of parameters, the so-called "curse of
dimensionality". When modelling metal absorbers to investigate $\Delta\alpha/\alpha$, one typically needs
a few to tens of components, with several different species, leading to a few to hundreds
of parameters. When modelling $\Delta\mu/\mu$ with H$_2$, models can easily reach thousands of
parameters. The number of parameters (high dimensionality) and the time taken to
evaluate the Voigt function conspire to render traditional Monte Carlo methods useless.
However, Markov Chain Monte Carlo methods can be applied successfully to problems of
moderate dimensionality where traditional Monte Carlo methods cannot.

7-2 Overview of the Markov Chain Monte Carlo method

A Markov chain is a series of points for which the next point can be generated only with
knowledge of the current point. That is, any series of points which satisfies

$$\mathbf{x}_{i+1} = f(\mathbf{x}_i) \qquad (7.10)$$

satisfies the Markov property (Liu, 2001). The essential idea of Markov Chain Monte
Carlo (MCMC) methods is not to uniformly sample some volume within which the target
probability distribution, $\Pr(\mathbf{x})$, is contained, but instead to construct a Markov chain
such that the stationary distribution of the chain is the target distribution $\Pr(\mathbf{x})$. Each
iteration of the chain yields a sample from $\Pr(\mathbf{x})$. Typically, one specifies a transition
rule $T(\mathbf{x}, \mathbf{x}')$ which proposes a new point, $\mathbf{x}'$, from the current point, $\mathbf{x}$. It turns out that
the combination of the Markov property and the detailed balance condition is sufficient to
generate a chain whose stationary distribution is the target distribution (Metropolis et al.,
1953; Liu, 2001). The detailed balance condition requires that the probability of jumping
from point $\mathbf{a}$ to point $\mathbf{b}$ is the same as jumping from point $\mathbf{b}$ to point $\mathbf{a}$, or

$$\Pr(\mathbf{a})\, T(\mathbf{a}, \mathbf{b}) = \Pr(\mathbf{b})\, T(\mathbf{b}, \mathbf{a}). \qquad (7.11)$$

Any Markov chain which is irreducible (that is, there is a non-zero probability to move
between any two points in the state space in a finite number of steps), aperiodic and
possesses an invariant distribution will converge to the invariant distribution, $\pi$. For
the Metropolis algorithm (see below in section 7-2.3), this is almost surely true (Tierney,
1994). Thus, even if the algorithm is started in a region of low likelihood, it will eventually
converge to the desired distribution. We describe below in section 7-2.4 why we think our
algorithm should correctly sample from the target distribution from the first iteration.

Whilst naive (e.g. uniform) sampling of $\Pr(\mathbf{x})$ degrades exponentially with the number
of parameters, one can construct MCMC algorithms which degrade only polynomially
with the number of parameters (see section 7-2.5.2 for a justification of this); this is the
primary advantage of MCMC methods. However,
because each step in the chain depends on the previous step, successive steps will be
correlated. The degree of correlation is difficult to predict a priori, as it depends on the
number of parameters, the target distribution, the proposal distribution, and the degree to
which the transition rule $T(\mathbf{x}, \mathbf{x}')$ is well tuned to the target distribution. If the correlation
is high, large numbers of steps will be required to obtain the equivalent of one independent
sample. Therefore, the running time of MCMC algorithms is unknown at the start, and
can only be determined by examining the chain as the algorithm progresses.

7-2.1 Applications of MCMC methods



MCMC methods have found wide application in a number of fields. From an astrophysical
perspective, they have been applied to determine posterior confidence regions from
CMB data (for example, CosmoMC: Lewis & Bridle (2002); Slosar & Hobson (2003);
Dunkley et al. (2005); Destri et al. (2008a,b)), for investigating CMB systematics (Gold
et al., 2010), for CMB model selection (see review by Trotta, 2008), in exoplanet searches
(Ford & Gregory, 2007; Balan & Lahav, 2009; Hrudkova et al., 2010), for analysis of
exoplanet atmospheres (Madhusudhan & Seager, 2010), for investigation of dark energy
models (Bozek et al., 2008; Wang & Xu, 2010), for investigating galaxy formation models
(Henriques et al., 2009; Lu et al., 2010), for investigating post-general relativity models
and testing general relativity (Daniel et al., 2010; Lombriser et al., 2010), for analysing
type Ia supernova data (Gong et al., 2010), for analysis of potential gravitational wave data
from the Laser Interferometer Gravitational-Wave Observatory (LIGO) and other gravitational
wave observatories (Robinson et al., 2008; van der Sluys et al., 2009; Raymond et al.,
2009), and for analysis of gamma ray bursts (GRBs) (Gou et al., 2007). In applications
which are directly relevant to the context of this work, Nakashima et al. (2008) applied
MCMC methods to the 5-year WMAP data to constrain $\Delta\alpha/\alpha$ as $-0.028 < \Delta\alpha/\alpha < 0.026$
(using data from the Hubble Space Telescope as a prior). Similarly, Wu & Chen (2010)
applied MCMC methods to the 5-year WMAP CMB data to constrain the change in the
gravitational constant to be $-0.083 < \Delta G_N/G_N < 0.095$. They also place constraints on
Brans-Dicke theories from the same data. Clearly the utility of MCMC methods is high,
and research into improving the method is active.



7-2.2 Aim of MCMC work 



Our aim is to verify that the purported solution of VPFIT is good and that parameter
uncertainties are reasonable by sampling from the likelihood function of our supplied
model. Note that $\chi^2 = -2\ln(L(\mathbf{x}))$, where $L(\mathbf{x})$ is the likelihood function, up to an
additive constant which can be neglected, as we only ever consider differences in $\chi^2$ for
finding parameters. As such, we define the likelihood function here as

$$L = e^{-\chi^2/2}. \qquad (7.12)$$

As we typically have hundreds of degrees of freedom, a naive calculation of $L$ can underflow
in IEEE 754 floating point implementations. To remedy this, $L$ can be calculated as
$L = e^{-(\chi^2 - \chi^2_{\min})/2}$, where $\chi^2_{\min}$ is the smallest value of $\chi^2$ under consideration. As only
ratios of likelihoods or ratios of sums of likelihoods are being considered, this extra factor
will always cancel. Another option is simply to work in $-2\ln L$, which avoids this problem.
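
Both remedies can be sketched in a few lines of Python. The function names and the $\chi^2$ values below are illustrative assumptions only; the values are chosen large enough that the naive exponential underflows in double precision.

```python
import numpy as np

def relative_likelihoods(chi2_values):
    """Likelihoods L = exp(-chi^2 / 2), rescaled so that the best fit has L = 1."""
    chi2 = np.asarray(chi2_values, dtype=float)
    return np.exp(-(chi2 - chi2.min()) / 2.0)

def log_likelihood_ratio(chi2_new, chi2_old):
    """ln[L(new) / L(old)], computed without ever forming L itself."""
    return -(chi2_new - chi2_old) / 2.0

chi2 = np.array([2400.0, 2401.5, 2410.0])   # illustrative values only
naive = np.exp(-chi2 / 2.0)                  # underflows to zero
shifted = relative_likelihoods(chi2)         # ratios survive the shift
print(naive, shifted)
```

Working with the logarithm of the ratio, as in `log_likelihood_ratio`, is usually the more convenient choice inside an accept/reject step, since only differences in $\chi^2$ are then required.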

7-2.3 Metropolis algorithm

The Metropolis algorithm (Metropolis et al., 1953) is perhaps the simplest MCMC algorithm.
The Metropolis algorithm proposes a new position in parameter space, $\mathbf{x}'$, based
on the current position, $\mathbf{x}$, according to some proposal function $T(\mathbf{x}, \mathbf{x}')$. The only requirement
imposed is that

$$T(\mathbf{x}, \mathbf{x}') = T(\mathbf{x}', \mathbf{x}) \qquad (7.13)$$

(that is, the proposal distribution is symmetric) (Liu, 2001).

In principle, there are a large number of possible proposal functions, $T$, although in practice
the most common choice is a multidimensional Gaussian centred on the current point (Liu,
2001), such that

$$\mathbf{x}' = \mathbf{x} + g\, \mathcal{N}(0, \Sigma), \qquad (7.14)$$

where $\Sigma$ is the covariance matrix obtained from the optimisation algorithm at the purported
best-fit solution, and $g$ is a scalar tuning factor. Note that the choice of $T$ influences
only the efficiency of the algorithm, not the formal correctness of the solution (Metropolis
et al., 1953; Tierney, 1994). To the extent that the estimated covariance matrix is not
a good approximation to the true covariance matrix, the algorithm's performance will
degrade.

The Metropolis algorithm generates a sequence of points, $\{\mathbf{x}_t\}$, according to a two step
prescription. First, from the current point, $\mathbf{x}_t$, propose a new point, $\mathbf{x}'$, via $T(\mathbf{x}_t, \mathbf{x}')$, and
calculate the ratio $q = L(\mathbf{x}')/L(\mathbf{x}_t)$. Second, with probability $\min(q, 1)$ move to the new
point (i.e. set $\mathbf{x}_{t+1} = \mathbf{x}'$); otherwise, retain the current point (i.e. set $\mathbf{x}_{t+1} = \mathbf{x}_t$). In this
fashion, proposed moves to a point which is more likely than the existing point are always
accepted, whereas moves to a point which is less likely than the existing point are sometimes
accepted, depending on the ratio of the likelihoods. For a sufficiently large number
of iterations, the distribution of $\{\mathbf{x}_t\}$ will sample from the underlying probability distribution
up to a normalisation constant. The probability distribution of each parameter
is approximated by the distribution of that parameter in the chain, $\{\mathbf{x}_t\}$. The algorithm
will spend most of its time in regions of high likelihood, and little time in regions of low
likelihood. It is for this reason that MCMC outperforms traditional Monte Carlo methods
in high-dimensional parameter spaces: for such spaces, most of
the hypervolume is located away from the region of interest, and therefore uniformly sampling
from a region will generally sample in regions of low likelihood, whereas for MCMC
samples are necessarily concentrated near regions of high likelihood. The algorithm must
be tuned to ensure reasonable running times; this is described below in section 7-2.5.
Note that moving to the new point only if $q > 1$ turns the Metropolis algorithm into a
stochastic optimiser.
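
The two step prescription can be written as a short Python sketch. This is a generic random-walk Metropolis sampler, not the code used with VPFIT: `chi2_fn` is a hypothetical callable returning $\chi^2$ for a parameter vector, and the function name and argument order are assumptions. The acceptance test is carried out on $\ln q$ so that the likelihood itself never has to be formed.

```python
import numpy as np

def metropolis(chi2_fn, x0, cov, g, n_steps, rng):
    """Random-walk Metropolis sampler for L = exp(-chi^2 / 2).

    chi2_fn is a hypothetical callable returning chi^2 at a parameter vector;
    cov is the proposal covariance matrix and g the scalar tuning factor.
    Returns the chain (n_steps x k array) and the acceptance rate.
    """
    x = np.asarray(x0, dtype=float)
    chi2_x = chi2_fn(x)
    chol = np.linalg.cholesky(cov)          # chol @ chol.T == cov
    chain = np.empty((n_steps, x.size))
    accepted = 0
    for t in range(n_steps):
        # Symmetric Gaussian proposal: x' = x + g * N(0, cov).
        trial = x + g * (chol @ rng.standard_normal(x.size))
        chi2_trial = chi2_fn(trial)
        # ln q = ln[L(x') / L(x)]; accept with probability min(q, 1).
        log_q = -(chi2_trial - chi2_x) / 2.0
        if np.log(rng.random()) < log_q:
            x, chi2_x = trial, chi2_trial
            accepted += 1
        chain[t] = x                         # a rejected move repeats the old point
    return chain, accepted / n_steps
```

A call such as `metropolis(chi2_fn, x_best, cov, 2.38 / np.sqrt(k), 100000, np.random.default_rng(0))` would start the chain at a purported best-fit point `x_best`; all of these argument values are placeholders.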

For reasons described below in section 7-2.7, we implement a variant of the Metropolis
algorithm known as the Multiple-Try Metropolis method.

The Metropolis-Hastings algorithm (Hastings, 1970) generalises the Metropolis algorithm
to allow non-symmetric proposal functions. However, in most cases it is not clear why
a non-symmetric proposal function should outperform a symmetric one. In any event,
we expect that our likelihood function should be approximately symmetric near the
likelihood maximum, in which case a symmetric proposal distribution seems reasonable.

7-2.4 Convergence & sampling efficiency

There are two key concerns for MCMC algorithms: converging to the target distribution
from the initialisation point (reaching stationarity), and obtaining sufficient numbers of
samples when stationary.

In our case, the first concern relates to the fact that the algorithm must start near the
likelihood maximum. If the algorithm is started away from the likelihood maximum, it
will eventually converge to the stationary distribution, but the time required for this is
unknown. Stationarity can be determined by inspecting the chain of samples to see if the
long term average of parameters differs significantly from their starting values. Standard
practice is to discard a certain number of samples from the start of the chain to allow
for "burn-in" (the exact number must be determined from the observed behaviour of
the chain). However, we start the algorithm with parameters set to those which give a
purported optimal solution from the VPFIT optimisation. In this case, the parameters
should already be at or near the maximum likelihood position, and so burn-in should be
unnecessary. This assumption can be verified by inspection of the chain, and we find that
our parameter values do not wander appreciably from their starting values.

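The second concern above relates to the fact that successive samples are correlated. For
a traditional Monte Carlo estimator, the precision of the mean of a set of samples is
trivially given by $\sigma_{\bar{x}_i} = \sigma_i/\sqrt{N}$, where $\sigma_i$ is the standard deviation of the samples for the
parameter of interest. If we define $\rho_j$ as the lag-$j$ autocorrelation of the MCMC chain for
some parameter (i.e. $\rho_j = \mathrm{corr}[\{x_t\}, \{x_{t+j}\}]$), then Liu (2001) gives

$$N \sigma_{\bar{x}_i}^2 = \sigma_i^2 \left[ 1 + 2 \sum_{j=1}^{N-1} \left( 1 - \frac{j}{N} \right) \rho_j \right] \approx \sigma_i^2 \left[ 1 + 2 \sum_{j=1}^{N-1} \rho_j \right]. \qquad (7.15)$$

If we define the integrated autocorrelation time as

$$\tau = \frac{1}{2} + \sum_{j=1}^{\infty} \rho_j, \qquad (7.16)$$

then

$$\sigma_{\bar{x}_i}^2 = \frac{2\tau}{N}\, \sigma_i^2. \qquad (7.17)$$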

The quantity $N/2\tau$ is commonly known as the effective sample size (Liu, 2001). Typically,
$\tau$ for even simple cases we consider may be of order ~ 10^, meaning that many samples
are required to obtain the equivalent of a single independent sample. Equivalently, one
needs $\eta = 2\tau$ samples to obtain the equivalent of one independent sample. Although
the presence of autocorrelation increases the running time substantially, this problem is
generally outweighed by the ability to adequately sample the likelihood region of interest.

By calculating $\tau$, one obtains a quantitative measure of the convergence of the chain. Proposal
functions which are poorly tuned to the target distribution will eventually generate
sufficient numbers of samples from the target distribution, but this can take a prohibitively
long time. We deem a final run acceptable if $\tau$ is much smaller than the chain length. The
ideal situation is $\tau \sim 1$, in which case the chain will look like noise; that is, the chain
appears stochastically invariant under random reordering of the chain
values. We describe a chain where $\tau \ll N$ as well-mixed.
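
One simple way to estimate $\tau$ and the effective sample size from a stored chain is sketched below in Python. The truncation rule (stop summing at the first non-positive autocorrelation) is an assumption made here to limit noise from the tail of the autocorrelation function; it is one of several common windowing choices and is not claimed to be the rule used in this work.

```python
import numpy as np

def integrated_autocorr_time(x, max_lag=None):
    """Estimate tau = 1/2 + sum_j rho_j for a 1D chain.

    The sum is truncated at the first non-positive autocorrelation, a simple
    windowing choice assumed here to keep noise in the tail from dominating.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if max_lag is None:
        max_lag = n // 2
    dx = x - x.mean()
    var = np.dot(dx, dx) / n
    tau = 0.5
    for j in range(1, max_lag):
        rho_j = np.dot(dx[:-j], dx[j:]) / (n * var)
        if rho_j <= 0.0:
            break
        tau += rho_j
    return tau

def effective_sample_size(x):
    """N / (2 tau): the equivalent number of independent samples."""
    return len(x) / (2.0 * integrated_autocorr_time(x))
```

For uncorrelated samples the estimate is close to $\tau = 1/2$, so that the effective sample size is close to $N$; strongly correlated chains give much larger $\tau$.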

7-2.5 Speeding convergence & reducing run-time 
7-2.5.1 Acceptance rate 

Let us define the acceptance rate for a large number of steps as the ratio of the number of 
accepted steps to the number of steps attempted. If the algorithm takes steps which are 
generally much larger than the scale of the target distribution, then the acceptance rate 
will be ~ 0%, and the parameters will rarely change, leading to high autocorrelations and 
therefore low sampling efficiency. On the other hand, if the algorithm takes steps which 
are generally much smaller than the scale of the target distribution, then the acceptance 
rate will be ~ 100%, but it will take a long time to fully explore the distribution. In this 
case, the parameters display random-walk behaviour. It turns out that if both the target 
and proposal distributions are Gaussian then the ideal acceptance rate in 1 dimension is 
44% (Gelman et al., 1995), in the sense that this acceptance rate produces the smallest 
autocorrelation time for the chain. It appears that if the acceptance rate is slightly too 
low then efficiency is not too adversely affected, whilst slight increases in the acceptance 
rate seem to confer much worse performance penalties (Liu, 2001). Although naively we 
might expect that an acceptance rate of about 50% is ideal also in higher dimensions, 
it surprisingly turns out that the optimal rate for $k$ dimensions as $k \to \infty$ is 23.4%
(Roberts et al., 1997). Understanding the long-run acceptance rate requires large numbers 
of samples. For our algorithm to be tuned such that the acceptance rate was close to 23% 
requires large amounts of time. As such, we attempt to tune our algorithm such that the 
acceptance rate is between 15% and 40%, and find that this rule works well in our cases. 
The actual tuning is achieved by modifying g. 

7-2.5.2 Tuning g 

Consider a $k$-dimensional Gaussian proposal function (equation 7.14) with a diagonal
covariance matrix where all entries on the diagonal are unity, and a target distribution
consisting of a Gaussian with some set of parameters and the same covariance matrix.
The probability of moving a radial distance $r$ is related to the $\chi$ distribution and is given
by

$$P(r) \propto r^{k-1} e^{-r^2/2}. \qquad (7.18)$$

The $r^{k-1}$ term arises from the fact that the volume element $d^k x$ has a radial term of $r^{k-1}$
in hyperspherical coordinates of appropriate dimension. This distribution is peaked at
$r = \sqrt{k-1}$, and so the most common step proposed will have length $\sqrt{k-1}$. However,
the target distribution only has a typical width of ~ 1 along any radial slice. This means
that for large $k$ most steps will land far from the likelihood maximum, meaning that
almost all steps will be rejected. This implies that we must scale the steps drawn from
the trial distribution by $\sim 1/\sqrt{k}$ (equivalently, its covariance matrix by $\sim 1/k$) in order to obtain reasonable acceptance rates. If
the target distribution is Gaussian, and the proposal distribution is Gaussian, then the
ideal acceptance rate is achieved by setting $g = 2.38/\sqrt{k}$ (Roberts et al., 1997). Thus, we
initialised our algorithm with $g = 2.38/\sqrt{k}$ as a first guess, in the hope of starting with
approximately good scaling.

To ensure that the tuning of $g$ is relatively optimal, before commencing a large MCMC
run, we conducted small runs of 250 iterations and then compared the acceptance rate
to the target rate range. If the acceptance rate was too high, we increased $g$, and if the
acceptance rate was too low, we decreased $g$. The adjustment of $g$ was done automatically
by our algorithm. This process was then repeated until a reasonable acceptance rate
was achieved. Samples obtained in this way were discarded.
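
A tuning loop of this kind might look like the following Python sketch. The 250-iteration trial runs, the 15% to 40% target range and the starting guess $g = 2.38/\sqrt{k}$ come from the text; the multiplicative update `factor`, the cap `max_rounds`, the callable `chi2_fn` and both function names are assumptions, since the exact adjustment rule is not specified.

```python
import numpy as np

def acceptance_rate(chi2_fn, x0, chol, g, n_steps, rng):
    """Acceptance rate of a short Metropolis run started at x0 (samples discarded)."""
    x = np.asarray(x0, dtype=float)
    chi2_x = chi2_fn(x)
    accepted = 0
    for _ in range(n_steps):
        trial = x + g * (chol @ rng.standard_normal(x.size))
        chi2_trial = chi2_fn(trial)
        if np.log(rng.random()) < -(chi2_trial - chi2_x) / 2.0:
            x, chi2_x = trial, chi2_trial
            accepted += 1
    return accepted / n_steps

def tune_g(chi2_fn, x0, cov, rng, low=0.15, high=0.40, n_steps=250,
           factor=1.5, max_rounds=50):
    """Adjust g until the acceptance rate of a short run falls in [low, high].

    The multiplicative update `factor` and the cap `max_rounds` are assumptions
    made for this sketch; the text does not specify the adjustment rule.
    """
    k = len(x0)
    chol = np.linalg.cholesky(cov)
    g = 2.38 / np.sqrt(k)                  # starting guess (Roberts et al., 1997)
    rate = acceptance_rate(chi2_fn, x0, chol, g, n_steps, rng)
    for _ in range(max_rounds):
        if rate > high:
            g *= factor                    # steps too timid: enlarge them
        elif rate < low:
            g /= factor                    # steps too bold: shrink them
        else:
            break
        rate = acceptance_rate(chi2_fn, x0, chol, g, n_steps, rng)
    return g, rate
```

Because each trial run is short, the measured acceptance rate is noisy, which is one reason to accept a fairly broad target range rather than aiming for a single value.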

Roberts et al. (1997) also noted that the efficiency of the Metropolis algorithm, compared
to independent samples from the target distribution, is approximately $0.3/k$. As we add
more components, the computational effort required to calculate the Voigt profiles increases
as a low power of the number of parameters. Certainly, one needs $n$ Voigt profiles for $n$ components,
but typically profiles with more components occupy larger spectral regions, requiring evaluation
of the Voigt function at $O(k)$ points. Each Voigt profile has 3 parameters (i.e.
$k \sim 3n$), although this may be reduced through parameter tying. In any event, this suggests
that the time required to evaluate the likelihood function scales as $O(k^2)$. Combined
with the result of Roberts et al. (1997), this implies that the approximate running time of
our MCMC algorithm scales as $O(k^3)$, thus justifying the earlier statement that our algorithm
degrades only polynomially with increased dimensionality. A naive uniform Monte
Carlo sampler, on the other hand, would have running times that scale as $O(c^k)$ for some
$c$.

7-2.5.3 Covariance matrix re-estimation 

In order to try to ensure that the covariance matrix of our proposal distribution is well
suited to the target distribution, we ran our MCMC algorithm multiple times (typically
five to ten) with several hundred thousand iterations per stage. After each stage, we re-estimated
the covariance matrix from the chain. With sufficient numbers of stages, the
covariance matrix should eventually converge on one which adequately samples the target
distribution. Although, as noted, the formal correctness of the solution does not depend
on $\Sigma$, in practice if $\Sigma$ is badly tuned then the running time can become unacceptably
large. Performing this re-estimation process drastically increases the chance that the final
MCMC run will produce a good approximation of the underlying probability distribution.
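
The re-estimation step itself is simple. In the Python sketch below the chain is assumed to be stored as an array with one row per iteration and one column per parameter; the function name is hypothetical.

```python
import numpy as np

def reestimate_covariance(chain):
    """Sample covariance of a chain stored as an (n_samples, k) array."""
    chain = np.asarray(chain, dtype=float)
    return np.cov(chain, rowvar=False)
```

The returned matrix would be passed as the proposal covariance for the next stage, with the tuning factor $g$ re-tuned afterwards since the appropriate scaling changes with the covariance.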

7-2.5.4 Proposal distribution 

As a starting point, we assume that the distribution of parameters is likely to be well
approximated by a multidimensional Gaussian near the likelihood maximum for large
numbers of degrees of freedom. It is well known, however, that the Voigt profile decomposition
is not unique. This means that, away from a particular local likelihood maximum,
we may discover multiple likelihood maxima (only one of which is a global maximum),
as well as likelihood "shelves", and other interesting features. If any of these features
occurs sufficiently close to the solution returned by VPFIT, they should be observable in
the MCMC chain. Note that the MCMC algorithm guarantees that all of these features will
eventually be reached, but if areas of relatively high likelihood are separated by large regions
of low likelihood, the chance of discovering other likelihood maxima in a reasonable time
is exceedingly small.

If the target distribution is Gaussian, a Gaussian proposal distribution will yield good 
performance, provided that the proposal Gaussian is well tuned. However, if the tuning 
is bad initially, then sampling may be slow. To remedy this, we use a heuristic radial 
proposal distribution which has 

P{r) = g Qr^e-'-'/^ + le"^) . (7.19) 

This is an admixture of the radial component of a 2D Gaussian and an exponential distribution,
similar to that used by CosmoMC (Lewis & Bridle, 2002; see
http://cosmologist.info/notes/CosmoMC.pdf for more details on this proposal function). The rationale for this
is that, with a purely Gaussian proposal, if the covariance matrix is poorly tuned then large
steps are rarely taken on account of the $e^{-r^2/2}$ term. An obvious potential problem is that the initial guess for $\Sigma$ is badly
matched to the true covariance matrix of the target distribution. Certainly, given that we
are trying to verify whether the parameter estimates provided by VPFIT are correct, we
cannot assume that $\Sigma$ is good. One typical problem is that one parameter uncertainty
estimate is bad (that is, the error estimate seems implausibly large), perhaps because the
data are overfitted, or because the evidence for that component is weak. Suppose that
an initial covariance matrix is supplied where the uncertainty on one parameter is much
larger than the true uncertainty. In this event, the tuning factor $g$ will shrink until this
parameter is being reasonably well sampled, in order to achieve a reasonable acceptance
rate. However, this means that exploration of other parameters will be very slow. As a
result, re-estimation of the covariance matrix is less likely to obtain a useful estimate of
$\Sigma$, as the exploration of the other parameters will not have occurred in a reasonable time.
The exponential factor above helps to remedy this, by occasionally taking large steps.
Additionally, for a $k$-dimensional Gaussian (where $k > 2$) the probability of taking small
steps is minimal. This, again, means that the proposal distribution must be well-tuned
in order to achieve good exploration of the parameter space. The admixture of the exponential
yields a non-negligible probability of taking small steps. The proposal function
given in equation 7.19 will demonstrate suboptimal performance if the target distribution
is Gaussian and the scaling of $\Sigma$ is good. On the other hand, where we have supplied covariance
matrices that are poorly tuned, or where the target distribution is non-Gaussian,
this method appears to increase the likelihood of the final MCMC run being useful.

To generate trial steps, we generate a vector of perturbations $\mathbf{p}$ from a spherically symmetric
distribution with radial probability density $P(r)$ and then left multiply by $\mathbf{L}$ (where
$\mathbf{L}\mathbf{L}^T = \Sigma$) so that the proposal has the correct covariance structure (Press et al., 2007).
A new test point is thus given by $\mathbf{x}' = \mathbf{x} + \mathbf{L}\mathbf{p}$.
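
The construction of a trial step can be sketched in Python as follows. The radius is drawn from a two-component mixture of the radial part of a 2D Gaussian (a Rayleigh distribution) and a unit exponential, in the spirit of the proposal described above; the mixing weight `w_gauss`, its default value, the optional scale `g` and the function name are assumptions for illustration and are not the weights of equation 7.19.

```python
import numpy as np

def radial_proposal_step(x, cov, rng, g=1.0, w_gauss=0.5):
    """Propose x' = x + g * L p with p spherically symmetric.

    The radius is drawn from a two-component mixture: with probability w_gauss
    from the radial part of a 2D Gaussian (a Rayleigh distribution), otherwise
    from a unit exponential. w_gauss = 0.5 is an arbitrary illustrative value.
    """
    k = len(x)
    L = np.linalg.cholesky(cov)               # L @ L.T == cov
    direction = rng.standard_normal(k)
    direction /= np.linalg.norm(direction)    # uniform on the unit sphere
    if rng.random() < w_gauss:
        r = rng.rayleigh(scale=1.0)           # density r * exp(-r^2 / 2)
    else:
        r = rng.exponential(scale=1.0)        # density exp(-r)
    return np.asarray(x, dtype=float) + g * (L @ (r * direction))
```

Because the direction is uniform on the sphere and the radius is independent of it, the proposal is symmetric, and multiplying by the Cholesky factor gives the steps a covariance proportional to $\Sigma$.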

7-2.6 Chain thinning 

For large problems, storing the entire chain of values can be problematic, as this requires a
matrix of size Nk, where N is the number of iterations and k is the number of parameters.
If the chain is stored in human-readable form (i.e. a text file), the file can become quite large
for even moderate values of k. One solution to this is to "thin" the chain. That is, one
retains only every ith iteration of the chain. Provided that i ≪ τ, where τ is the
autocorrelation time, one is not throwing away large amounts of useful information. Indeed,
thinning the chain has the effect of reducing the autocorrelation time as τ → τ/i.
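
As an illustration of thinning, the following sketch keeps every ith sample of a chain and compares the lag-1 autocorrelation before and after. The AR(1) process is only a toy stand-in for real MCMC output, and all function names are hypothetical.

```python
import random


def thin(chain, i):
    """Keep only every i-th iteration of the chain (the i-th, 2i-th, ... samples)."""
    return chain[i - 1::i]


def lag1_autocorrelation(values):
    """Sample autocorrelation of the sequence at lag 1."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[t] - mean) * (values[t + 1] - mean) for t in range(n - 1))
    return cov / var


def ar1_chain(n, phi, rng):
    """Toy correlated chain x_t = phi * x_{t-1} + noise, standing in for MCMC samples."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out
```

For the toy chain the lag-1 autocorrelation is phi before thinning and phi**i afterwards, which is the sense in which successive retained samples become less correlated.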

7-2.7 Multiple-Try Metropolis 

As noted in section 7-2.5.1, even for an optimal Σ, convergence slows down due to the
need to take g ~ 1/√k. Although the acceptance rate will be reasonable, each iteration
will typically take a step of only ~ σ_i/√k in each parameter, requiring long running
times to adequately explore the parameter space. To attempt to combat this problem,
we implement the Multiple-Try Metropolis method (MTM) (Liu et al., 2000; Liu, 2001).
Rather than attempting a single step at each iteration, the MTM method tries many
different steps in order to better explore the local parameter space. This is done
in such a way as to maintain the detailed balance requirement (equation 7.13). The
MTM proceeds as follows. Firstly, draw m independent trial proposals y_1, ..., y_m from
a symmetric proposal function T(x, ·). Compute for each trial value w_j = L(y_j). Now,
from the trial set y_1, ..., y_m, select y with probability proportional to w_j. Then, produce
a reference set by drawing x*_1, ..., x*_{m−1} from the distribution T(y, ·). Let x*_m = x (which
preserves the detailed balance requirement). Also, create weights w*_j = L(x*_j). Now accept
y with probability

\[ r = \min\left\{ 1, \frac{w_1 + \cdots + w_m}{w^*_1 + \cdots + w^*_m} \right\}, \]

and otherwise remain at x.

We refer to m as the cloud size, as the algorithm generates a cloud of points around 
the current point. Effectively, this algorithm allows larger potential step sizes whilst still 
maintaining a reasonable acceptance rate (Liu, 2001). 

Our experience is that this method significantly reduces the required running time by 
taking larger steps and by reducing the autocorrelation length of the chain. Although 
at each iteration one must evaluate the likelihood for 2m − 1 new points
(as opposed to just 1 for the standard Metropolis algorithm), we find that this extra
computational burden is more than offset in running time by the use of the
MTM algorithm (similar to Liu et al., 2000). With experimentation, we have found that
m = 10 worked well, in that the chain autocorrelation time was much smaller. We noted
that for m ≳ 10, the autocorrelation time did not seem to decrease faster than the
computing time increased. Thus, m = 10 appears in our case to be a reasonable trade-off
between exploring the parameter space around the current point (local investigation)
and the need to take large numbers of steps to explore the parameter space fully (global
investigation). We note that because the trial points y_1, ..., y_m and then the points
x*_1, ..., x*_{m−1} (after y has been selected) are drawn independently, their likelihoods
can be evaluated concurrently, and so the MTM method is partially
parallelisable. We have implemented this parallelisation using the OpenMP extensions to
GFORTRAN.
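
A single MTM iteration, as described in this section, can be sketched as follows. This is a serial Python illustration under stated assumptions, not our parallel Fortran implementation: the name `mtm_step` and its arguments are hypothetical, the proposal `draw_from(point, rng)` is assumed to be symmetric, and the weights are taken to be the likelihood itself, as in the symmetric-proposal form of the method given by Liu et al. (2000).

```python
import math
import random


def mtm_step(x, likelihood, draw_from, m, rng):
    """Perform one Multiple-Try Metropolis iteration; return (new_point, accepted)."""
    # Cloud of m trial points around the current point, with weights w_j = L(y_j).
    trials = [draw_from(x, rng) for _ in range(m)]
    w = [likelihood(y) for y in trials]
    total_w = sum(w)
    if total_w == 0.0:
        return x, False  # every trial has zero likelihood, so nothing can be selected
    # Select one trial with probability proportional to its weight.
    y = rng.choices(trials, weights=w, k=1)[0]
    # Reference set: m - 1 draws around y, plus the current point x itself.
    reference = [draw_from(y, rng) for _ in range(m - 1)] + [x]
    # L(x) is recomputed here for brevity; a real code would cache it.
    total_w_ref = sum(likelihood(z) for z in reference)
    accept_probability = min(1.0, total_w / total_w_ref)
    if rng.random() < accept_probability:
        return y, True
    return x, False
```

The two list comprehensions that draw and evaluate the trial and reference sets are the parts that can be evaluated concurrently, since the points within each set do not depend on one another.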

7-2.8 MCMC as a Bayesian sampler 

The MCMC method can be used to explore any probability distribution. However, MCMC
can also be used to directly estimate posterior probabilities in the Bayesian framework
with the appropriate choice of prior. In the likelihood ratio, L(x) is then replaced by
L(x)π(x), where π(x) is the Bayesian prior for a particular set of parameters. For the column
densities and redshifts of transitions, we utilise improper flat priors. Given that we are
interested for these purposes in parameter estimation, we do not require normalised priors.
We also use a flat prior on Δα/α as an agnostic position.

Our experience has been that for the b parameters of transitions with small b (typically less
than a few km/s) the algorithm tends to propose many movements to b < 0, which must
be rejected (lines with b < 0 are unphysical). This is equivalent to imposing the improper
prior π(b) = 0 for b < 0 and π(b) = 1 for b > 0. In any event, the distribution of b for lines
with b/σ_b < 1 (where σ_b is the standard deviation of the b parameter estimated from the
covariance matrix) cannot be Gaussian given that b > 0. To remedy this, we heuristically chose
a flat prior for the logarithm of b, which tends to suppress movement to small b. This is
implemented by simulating log₁₀ b rather than b, and then using a uniform prior in log₁₀ b. We
have found that this prior provides significantly better running times.
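
For reference, the prior on b implied by a uniform prior on log₁₀ b follows from a change of variables. The short derivation below is added for illustration; the symbol θ is introduced here and does not appear elsewhere in this chapter.

```latex
\theta = \log_{10} b, \qquad \pi_\theta(\theta) = \mathrm{const}
\quad\Longrightarrow\quad
\pi_b(b) = \pi_\theta(\theta)\,\left|\frac{d\theta}{db}\right|
         = \frac{\mathrm{const}}{b \ln 10} \;\propto\; \frac{1}{b},
\qquad b = 10^{\theta} > 0 .
```

Because b = 10^θ is positive for every real θ, proposals made in θ can never produce an unphysical b < 0. The resulting prior is improper, which is acceptable for parameter estimation as noted above.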

In principle, one could use observed distributions of parameters as priors for the parameter
estimates. However, in most cases the statistical constraints on our line parameters are
good. In this event, the prior is relatively flat across the region of interest of the likelihood
function, which means that parameter estimates will be only marginally affected by the
choice of prior. It is also for this reason that our choice of a logarithmic prior on b does
not lead to unnaturally large estimates of b. The correct specification of priors is not
particularly important in our case because the data quality is high; in this case, the
likelihood is sufficiently restrictive to obtain a normalisable posterior.



7-2.9 Hard limits 



Although we have implemented a uniform prior on the logarithm of the b parameters, this
did not entirely solve the problem associated with narrow lines. Whereas when fitting in
b the range of allowable values is b ∈ (0, ∞), when using log₁₀ b the allowable range is
log₁₀ b ∈ (−∞, ∞). For transitions with b much smaller than the instrumental resolution
(~6 km/s for the VLT data), there is effectively no change in the likelihood for small changes
in b on account of the convolution. This means that, for example, b ~ 0.1 km/s is effectively
indistinguishable from b = 0.01 km/s (see the footnote below). Two problems can arise from
this. Firstly, our Voigt profile model for each flux pixel was computed using a sub-binned
profile for each pixel, with some binning size. We have used nbin = 21. That is, the flux value
for each pixel is calculated as the average of the model evaluated at 21 points straddling the
pixel, such that the bins are uniformly distributed between ±1/2 a pixel. If b becomes too small,
however, this sub-binning will start to miss significant amounts of flux, thereby rendering
the profile generation incorrect. Secondly, even if we had arbitrarily large numbers of bins,
we would observe that the distribution of log₁₀ b is not normal because of the convolution.
That is, the likelihood is effectively flat for b_i ≪ b_ip (where b_ip is the width of the
instrumental profile), or equivalently for small log₁₀(b_i) in this case. Thus, for very narrow
lines, log₁₀ b is bounded above but unbounded below, and log₁₀ b will execute a random walk
toward −∞. In finite precision arithmetic, this will eventually cause underflow to zero,
rendering all subsequent iterations meaningless. The only solution to this is to implement a
hard boundary on log₁₀ b at some point. We chose the boundary through experimentation to
be b_min = 1 km/s. We trialed b_min = 0.1 km/s, but found that this significantly degraded
the performance of the algorithm. In future works, one can easily choose a smaller b_min
than we have, but we note simply that our choice of b_min does not affect our conclusions.
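
The first problem, loss of flux through sub-binning, can be illustrated with a toy calculation. The sketch below is not our profile code: it uses a Gaussian depth profile exp(−((v − v0)/b)²) rather than a Voigt profile, ignores the instrumental convolution, and assumes the 21 evaluation points sit at the midpoints of equal sub-bins. All names are hypothetical.

```python
import math


def subbinned_pixel_value(profile, pixel_centre, pixel_width, nbin=21):
    """Average profile(v) over nbin points spread uniformly across one pixel."""
    total = 0.0
    for j in range(nbin):
        # Midpoint of sub-bin j, running from -1/2 to +1/2 of a pixel.
        offset = ((j + 0.5) / nbin - 0.5) * pixel_width
        total += profile(pixel_centre + offset)
    return total / nbin


def gaussian_line(v0, b):
    """Toy line-depth profile centred at v0 with Doppler parameter b."""
    return lambda v: math.exp(-((v - v0) / b) ** 2)


def exact_pixel_value(v0, b, pixel_centre, pixel_width):
    """Analytic pixel average of the toy profile, for comparison."""
    lo = (pixel_centre - 0.5 * pixel_width - v0) / b
    hi = (pixel_centre + 0.5 * pixel_width - v0) / b
    return b * math.sqrt(math.pi) / (2.0 * pixel_width) * (math.erf(hi) - math.erf(lo))
```

When b is comparable to the pixel width the two functions agree closely, whereas a line much narrower than the sub-bin spacing that falls between two evaluation points contributes almost nothing to the sub-binned average even though its true pixel average is non-zero.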

^ b cannot be arbitrarily small because of kinematic considerations, but VPFIT has no prior
knowledge of cloud kinematics, and thus in principle arbitrarily small values of b may arise
from the fitting process. VPFIT implements a user-adjustable lower limit to b to prevent it
becoming too small.

7-3 Application to quasar absorbers

In this section, we present the application of our MCMC algorithm to three quasar
absorbers for the purpose of determining whether the estimates of Δα/α produced by VPFIT
are reasonable. Redshifts given below all refer to the absorption redshift of the system.
In all three cases, we find good agreement between the VPFIT result and that produced by
our MCMC code, although the statistical uncertainties produced by our MCMC code are
mildly smaller than those produced by VPFIT, indicating that VPFIT may be conservative.
Interestingly, we note in the case of Q 0551-366 (section 7-3.3) that even though some
of the parameters are clearly not Gaussian, Δα/α is, and that the VPFIT result for Δα/α
accords well with that derived from the MCMC method. We show the VPFIT results
compared to our MCMC algorithm in table 7.1, and give commentary on each of the absorbers
and results below.

Table 7.1: Comparison of purported values of Δα/α calculated by VPFIT, and the results
of our MCMC algorithm. Quoted uncertainties are 1σ. Δα/α is given in units of 10⁻⁵.

Object            z_abs    Δα/α (VPFIT)    Δα/α (MCMC)
LBQS 2206-1958    1.018    -0.51 ± 1.07    -0.51 ± 0.88
LBQS 0013-0029    2.029    -0.86 ± 0.94    -0.83 ± 0.78
Q 0551-366        1.748    -0.80 ± 1.08    -0.89 ± 0.84


7-3.1 LBQS 2206-1958 (J220852-194359) z = 1.018 

This absorption system appears to be well fitted by a single Voigt profile. We use the
Mg II λλ2796,2803 transitions, which are relatively insensitive to α variation, and the
Fe II λλλλ2382,2600,2344,2587 transitions, which are strongly sensitive. We show the fit
used for this absorber in figure 7.1.

As only a single component is fitted, we naively expect that the chain should be
well-mixed even without re-estimation of the covariance matrix. This is true, at least by eye.
Nevertheless, we carried out re-estimation of the covariance matrix in order to try to
optimise the efficiency of the final run. We show the chain of Δα/α values in figure 7.2.
For the Δα/α chain, the autocorrelation length is 53, which is much less than the chain
length N. In figure 7.3 we show the histogram of these chain values, which yields the
distribution of Δα/α. The distribution of Δα/α is well approximated by a Gaussian, and the
mean value of the chain values corresponds extremely well with that produced by VPFIT.
Interestingly, the standard deviation of the chain values (σ_MCMC = 0.88 × 10⁻⁵) is somewhat
smaller than the uncertainty estimate on Δα/α returned by VPFIT (σ_VPFIT = 1.07 × 10⁻⁵).

The parameters are approximately jointly Gaussian. This is expected for a single-component
fit, because the Voigt profile decomposition is effectively unique with one component.
We show the chain values of Δα/α plotted against the chain values of the column density
of the Mg II λ2796 component in figure 7.4. The probability density is larger where the
density of points is greater. The joint distribution here is elliptical, and the individual
distributions are Gaussian, suggesting that the joint distribution is well described by a 2D
Gaussian.

Figure 7.1: Fit for the z = 1.018 absorber toward LBQS 2206-1958 (J220852-194359)
used for our MCMC analysis.

Figure 7.2: Chain of Δα/α values for the z = 1.018 absorber toward LBQS 2206-1958
(J220852-194359). There are no long range correlations visible by eye, suggesting the
sampling is good.

Figure 7.3: Histogram of Δα/α chain values for the z = 1.018 absorber toward LBQS
2206-1958 (J220852-194359). The resulting distribution appears to be well described by
a Gaussian.

Figure 7.4: Chain values of Δα/α vs log₁₀[N(1)], where N(1) is the column density
of the Mg II λ2796 component in atoms/cm², for the z = 1.018 absorber toward LBQS
2206-1958 (J220852-194359). The two parameters appear to be jointly Gaussian.



7-3.2 LBQS 0013-0029 (J001602-001225) z = 2.029 

This system appears with two obvious features. We find that the bluer feature is better
fitted by two components than one on the basis of the AICC. Thus, we fit three components
in total. Here, we use a wide variety of transitions, namely: Si II λ1526, Al III λλ1854,1862,
Fe II λλλλλ2382,2600,2344,2587,1608 and Mg I λ2852. We show the fit used in figure 7.5.

In figure 7.6 we show an example of a stage where the covariance matrix is poorly tuned
to the target distribution. One observes that the timescale required to retrace the path to
the central value is on the order of thousands of steps, implying that one needs thousands
of samples to obtain the equivalent of one independent sample. This intuition accords
well with an explicit calculation, which yields an autocorrelation length of ≈ 1100. Whilst
this chain will eventually adequately sample the parameter space, the running time this would
take is between tens and hundreds of times longer than would be necessary if the covariance
matrix were well tuned. This demonstrates the utility of re-estimating the covariance matrix
multiple times.

Figure 7.5: Fit for the z = 2.029 absorber toward LBQS 0013-0029 (J001602-001225)
used for our MCMC analysis. The existence of two components in the blue feature is
preferred over one on the basis of the AICC. However, because our parameter values for
Al III are not tied to those in the other components, and because the optical depth of
Al III is relatively low, the fit will only support a single component in Al III in the blue
feature. The presence of two tick marks per velocity component in the Al III fits is due to
the fact that the hyperfine components are explicitly shown.

Our final MCMC run here consisted of 600,000 iterations, where the chain was thinned
by a factor of 5 to yield 120,000 samples. For the chain of Δα/α values, we find that the
chain is well-mixed, with an autocorrelation length of ≈ 76 after thinning. We show the
histogram of the chain of Δα/α values in figure 7.7, and note that it appears to be well
described by a Gaussian.
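
To connect an autocorrelation length to an equivalent number of independent samples, the following sketch estimates an integrated autocorrelation time from a chain and divides the chain length by it. The estimator shown, with a self-consistent truncation window, is one common choice and is an assumption of this example; it is not necessarily the definition used for the values quoted in this chapter. The function names are hypothetical.

```python
import random


def integrated_autocorrelation_time(values, window_factor=5.0):
    """Estimate tau = 1 + 2 * sum_{t >= 1} rho(t), truncated at the first lag >= window_factor * tau."""
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    var = sum(c * c for c in centred) / n
    tau = 1.0
    for lag in range(1, n // 2):
        # Autocovariance at this lag, normalised by n as for the variance.
        cov = sum(a * b for a, b in zip(centred, centred[lag:])) / n
        tau += 2.0 * cov / var
        if lag >= window_factor * tau:
            break
    return tau


def effective_samples(n_samples, tau):
    """Approximate number of independent-equivalent samples in a chain of length n_samples."""
    return n_samples / tau
```

Taking the numbers quoted above at face value, `effective_samples(120000, 76)` gives roughly 1,600 independent-equivalent samples for the thinned chain.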



7-3.3 Q 0551-366 (J055246-363727) z = 1.748 

This absorption system appears as a single weak feature next to one relatively strong
feature, with some overlap. We find that the bluer feature is well modelled by one component;
however, the higher wavelength feature appears to require two closely spaced components
to achieve a statistically acceptable fit. We model the absorption with Si II λ1526,
Mg I λ2852, and Fe II λλλλλλ2382,2600,2344,2587,1608,2374. We show the fit used in figure 7.8.

Figure 7.6: Example of a chain that is not well-mixed for the z = 2.029 absorber toward
LBQS 0013-0029 (J001602-001225). This chain shows the chain values of Δα/α. Note
that long-range correlations are easily visible by eye. Visually, the timescale taken to
return to the central region is of order thousands of steps. This accords well with an
explicit calculation, which gives an autocorrelation length of ≈ 1100.

Our final MCMC run here consisted of 750,000 iterations. We note that the parameters
corresponding to the reddest two components do not seem to be Gaussian. Figure 7.9
shows the histogram of the column density of the central component of the Si II λ1526 fit,
which displays clear deviations from Gaussianity. The column density returned by VPFIT
is, as expected, that given by the mode of the distribution, near N ≈ 10¹³ atoms/cm². The
covariance matrix, which gives the parameter uncertainty estimates, is determined at this
point. As the distribution of this column density is not Gaussian, the uncertainty returned
by VPFIT is not a good description of the true probability density for this component. In
figure 7.11 we show the chain values of the column density of this component in Si II λ1526
plotted against the velocity dispersion of this component. This shows significant deviations
from the expected elliptical shape, indicating that a multivariate Gaussian is not a good
description of the probability density of the parameters.

We show in figure 7.12 the chain of samples for Δα/α, which has been thinned by a
factor of 10. For the thinned chain, we obtain an autocorrelation length of ≈ 18 (which
implies ≈ 180 for the unthinned chain), which means that the chain is well-mixed. We show
the autocorrelogram of the chain of Δα/α values (thinned by a factor of 10) in figure 7.13.
The autocorrelogram demonstrates rapid decay of the autocorrelation function compared to
the chain length, which indicates that the chain possesses many independent samples.
Figure 7.10 shows the histogram of the chain values for Δα/α.

Figure 7.7: Histogram of chain values of Δα/α for the z = 2.029 absorber toward
LBQS 0013-0029 (J001602-001225). The distribution appears to be well described by a
Gaussian.

Figure 7.8: Fit for the z = 1.748 absorber toward Q 0551-366 (J055246-363727) used for
our MCMC analysis. The existence of two components in the central feature is preferred
over one on the basis of the AICC. Note that because the two components are closely
spaced, we expect that the parameters of these two components will be non-negligibly
correlated.

Figure 7.9: Histogram of chain values of log₁₀[N(2)/cm⁻²], where N(2) is the column
density of the central component to the fit in Si II λ1526 for the z = 1.748 absorber toward
Q 0551-366. The units of N(2) are atoms/cm². VPFIT correctly finds the maximum
likelihood estimate of the column density as ≈ 10¹³ atoms/cm². However, there appears to
be a probability shelf near = 10^^'^ atoms/cm^. This implies that the vpfit uncertainty, 
which is based on the covariance matrix at the maximum likelihood solution, is not a full
description of the probability space, being based on the assumption that the parameters
are jointly Gaussian. For this parameter, VPFIT gives log₁₀ N = 13.0 ± 0.2.

Figure 7.10: Histogram of chain values of Δα/α for the z = 1.748 absorber toward Q
0551-366. Note that Δα/α appears to be Gaussian, despite the fact that other parameters
are not.

Figure 7.11: Plot of the values of the chain for log₁₀[N(2)] vs b(2), where N(2) is the
column density of the central component of the fit for Si II λ1526 in atoms/cm² and b(2)
is the velocity dispersion parameter of the same component in km/s, for the z = 1.748
absorption system toward Q 0551-366. Note that the parameters are clearly not jointly
Gaussian. The hard limit at the lower edge is caused by one of the other b parameters
hitting a hard limit, namely that the b parameters must not decrease below 1 km/s. The
rationale for this is given in section 7-2.9.

Figure 7.12: Chain values of Δα/α for the z = 1.748 absorption system toward Q
0551-366. No large scale correlation is visible, implying that the chain is well-mixed.
This run used 750,000 iterations, but the final chain has been thinned by a factor of 10.

Figure 7.13: Autocorrelation function for the chain values of Δα/α for the z = 1.748
absorber toward Q 0551-366, where the chain has been thinned by a factor of 10. Note
that the autocorrelation function decays extremely rapidly compared to the chain length.
As such, the chain as a whole provides many independent-equivalent samples of Δα/α.

Importantly, even though several of the parameters are not Gaussian, Δα/α appears to
be well described by a Gaussian distribution. We naively expect this, as Δα/α should not
be strongly correlated with the parameters and because there should be a unique value
of Δα/α for each spectrum. The MCMC results here confirm our a priori beliefs about
the distribution of Δα/α. Additionally, the estimate of the best-fitting value of Δα/α
returned by VPFIT accords well with the values obtained from the MCMC samples (see
table 7.1). For our purposes, we are primarily interested in the value of Δα/α; all other
parameters are nuisance parameters. Therefore, it is extremely reassuring that VPFIT
produces good parameter estimates and uncertainties for Δα/α even in the presence of
non-Gaussianity of some parameters.



7-4 Discussion & conclusion



In this chapter, we have demonstrated successful application of MCMC techniques to
explicitly verifying the solution of VPFIT for simple metal absorbers. The MCMC technique
is relatively robust, and the application to more complicated systems is limited by the
computing power available. We note that the running time for Q 0551-366 (section 7-3.3)
is several days. Application of this method to more complicated cases therefore requires
either a great deal of patience or the use of supercomputing facilities (or both). More
problematic is the fact that the running time to converge is unknown a priori. A more
sophisticated version of our algorithm would use a variable number of stages, with some
termination criterion based on autocorrelation; however, we have not needed to implement
that for our cases. Ultimately, we would like to apply our algorithm to substantially more
complicated cases. In particular, we would like to verify the uncertainty on the estimate of
Δμ/μ set out earlier in this work. Unfortunately, these fits present with thousands of
parameters rather than tens. We nevertheless attempted to examine whether this problem
was remotely tractable with our MCMC algorithm, and found that after one month of
CPU time on a dual-core Pentium D 3.6 GHz convergence had not been achieved.
Thus, we leave this to future work.

Ultimately, the primary goal of this work was to verify that the uncertainties produced by
VPFIT are reasonable, and we have demonstrated that this is true for simple situations.
Experience with VPFIT suggests that there does not appear to be any indication of failure
with moderately complicated circumstances, and so we argue both that the optimisation
algorithm used by VPFIT is robust and that the uncertainties produced are reasonable.
An incidental consequence of this work is demonstrating that the Gauss-Newton
approximation to the covariance matrix given by equation 7.8 is good; if it were not, the
uncertainties produced by VPFIT would differ substantially from those given by the MCMC
algorithm.

One intriguing possibility for future work is the use of nested sampling (Skilling, 2004;
Feroz & Hobson, 2008), which appears to cope well with both multimodal distributions
and high dimensionality. Not only does this method produce samples from the posterior
(as we obtain here with MCMC), but importantly one obtains the Bayesian evidence,
thereby allowing the direct comparison of competing models. Nested sampling transforms
the multidimensional evidence integral (which is often notoriously difficult to evaluate) into
a one dimensional version which can be approximated using the trapezium rule provided
one can sample from the prior, where the drawn sample must have L(x) > L_j for some
L_j. The work of Feroz & Hobson (2008) provides a method to decompose likelihoods
of arbitrary complexity into a series of ellipsoidal approximations, to which the nested
sampling algorithm can be applied. Although in principle this method works well for high
dimensionality, the ellipses provide hard boundaries outside which samples will not be
drawn, and therefore the ellipse sizes must be chosen carefully so as not to miss significant
regions of the likelihood. They proposed an enlargement factor, (1 + f), by which the
ellipses should be grown so as not to miss points. Unfortunately, this re-introduces the
curse of dimensionality unless f ≈ 0, as the regions of interest will only constitute ~
1/(1 + f)^k of the sampled volume, where k is the number of parameters. However, Feroz &
Hobson (2008) noted that the time required is less than for MCMC implementations, and
therefore this technique shows significant promise. Application of this method to the
molecular hydrogen data may prove fruitful; however, to successfully tackle this challenge
substantial advances in computing speed will likely be required. It may simply be, however,
that with thousands of free parameters, full MCMC exploration of a realistic molecular
hydrogen fit may remain out of reach for some time.

Conclusions



In this thesis, we have utilised the fact that high precision spectroscopy allows precise
redshift measurements of quasar absorption lines to investigate the potential variation
in the proton-to-electron mass ratio, μ, and fine-structure constant, α. All the data
have been obtained with VLT/UVES, and all of the data are publicly available through
the ESO Science Archive, which helps to facilitate verification of these results and rapid
science generally. Below, we summarise the main conclusions of this work.

1. In chapter 3, we investigated possible variation of μ using molecular hydrogen
absorbers in high quality spectra of the quasars Q0405-443, Q0347-383 and Q0528-250.
We attempted to improve our analysis over that from previous works by modelling
the Lyman-α forest simultaneously with the H2 transitions, thereby accounting for
a clear source of uncertainty in determining the positions of the H2 line centroids.
The wavelength calibration of our spectra utilises a more accurate ThAr calibration
algorithm, which should significantly reduce wavelength calibration errors compared
to previous analyses. We have also explicitly accounted for the under-estimation of
flux uncertainties in regions of low flux that occurs when the spectra are reduced
using the MIDAS pipeline.

2. We found no statistically significant evidence for evolution in μ over cosmological
timescales, with a weighted mean of the Δμ/μ values from our direct χ² minimisation
method (DCMM) analysis of Δμ/μ = (2.6 ± 3.0) × 10⁻⁶ (statistical). The individual
values of Δμ/μ are themselves consistent with zero, being (10.1 ± 6.6) × 10⁻⁶,
(8.2 ± 7.5) × 10⁻⁶ and (−1.4 ± 3.9) × 10⁻⁶ for Q0405-443, Q0347-383 and Q0528-250
respectively. We are therefore unable to reproduce the evidence for a change in μ
found by Reinhold et al. (2006).

3. We also analysed the absorbers toward Q0405-443 and Q0347-383 using the
reduced redshift method (RRM), and found results consistent with those derived from
the DCMM. Importantly, our RRM results show χ²_ν ≈ 1 for the values of the
reduced redshift, ζ_i, about the best linear fit. This is in contrast with the results of
Reinhold et al. (2006), whose ζ_i values showed excess scatter, with χ²_ν = 2.1. We
can say, at least on the basis of the observed scatter of our data, that there appears
to be no evidence for unmodelled systematics in our analysis. We attributed this to
a combination of better wavelength calibration and the fact that we modelled the
Lyman-α forest in the vicinity of the H2 transitions, which should lead to a more
robust estimate of uncertainties.

4. We noted explicit advantages of the DCMM over the RRM, in that the DCMM
allows analysis of systems with overlapping velocity components, which is not possible
within the RRM. By fitting the components in the Q0528-250 absorber
simultaneously, we were able to obtain an extremely precise measurement of Δμ/μ.

5. We re-analysed the absorber toward Q0528-250 using new observations. We
investigated possible systematic errors, including: i) systematic distortions in the
wavelength scale due to ThAr calibration uncertainties; ii) intra-order wavelength
distortions; iii) potential velocity segregation between cold (J ∈ [0, 1]) and warm
(J ∈ [2, 4]) components; and iv) the effect of re-dispersion of the spectra. We found
that Δμ/μ = (0.2 ± 3.2_stat ± 1.9_sys) × 10⁻⁶, or (0.2 ± 3.7) × 10⁻⁶ if one aggregates
the effect of statistical and systematic errors.

6. A weighted mean of all our values of Δμ/μ yields (1.7 ± 2.4) × 10⁻⁶, an extremely
stringent bound on any change in μ. Including the result obtained from J2123-0050
by Malec et al. (2010) yields a weighted mean of Δμ/μ = (2.2 ± 2.2) × 10⁻⁶. The results
of chapter 3 are the best z > 1 constraints on Δμ/μ available.

7. In chapter 4, we applied the many-multiplet method to a large sample of quasar
absorbers, the spectra for which have been obtained over several years by many
different observers on VLT/UVES. Our aim was to produce a sample of comparable
size to the Keck sample (Murphy et al., 2004), with which we might be able to
support or contradict the 5σ evidence found from Keck/HIRES that Δα/α < 0 at
cosmological redshifts. In particular, with the result from Murphy et al. (2004) that
Δα/α = (−0.57 ± 0.11) × 10⁻⁵, we argued that a comparable sample might be able to show
inconsistency with Keck at the ~3.5σ level if Δα/α = 0.

8. Our final VLT sample consists of 153^ absorbers from 60 different sightlines. This
is the largest statistical sample of MM absorbers presented in any work so far. The
data quality is extremely good, representing the amalgamation of many exposures
taken over approximately 100 nights at the VLT. Even after accounting for random
errors, the data quality allows us to constrain Δα/α at the few parts-per-million level.

^ Excluding one absorber which was flagged as an outlier.

9. For the VLT Δα/α values taken by themselves, under a weighted mean model we
found that Δα/α = (0.21 ± 0.12) × 10⁻⁵. This result is inconsistent with the Keck
results at the 4.7σ level. However, we showed that both the VLT and Keck Δα/α
values can be made consistent if one assumes that spatial variation in α exists.
When we applied a simple model for angular variation in α, Δα/α = A cos Θ + m,
to the VLT Δα/α values we found a preference for a dipole+monopole model over
a monopole-only model at the 2.2σ level. Combining this with the Keck Δα/α
values, we found 4.1σ evidence for angular variation in α (in the sense that the
dipole+monopole model is preferred over the monopole-only model at the 4.1σ level),
having amplitude A = 0.97 (+0.22, −0.20) × 10⁻⁵, and pointing in the direction
RA = (17.3 ± 1.0) hr, dec = (−61 ± 10)°.

10. We showed that the VLT and Keck data demonstrate a remarkable internal
consistency, in that the dipole directions in a dipole+monopole model fitted to the Keck
and VLT Δα/α values point in a similar direction (with a chance probability of
alignment of 6 percent), and also that the dipole directions in a dipole+monopole
model fitted to low and high redshift cuts of the data (split at z = 1.6) also point in
a similar direction (with a chance probability of alignment of 2 percent). The joint
probability of obtaining alignment as good as is seen in both these cases is just 0.1
percent (equivalent to ≈ 3.3σ).

11. We noted the presence of a monopole at z < 1.6, which is statistically significant at
the 3.6σ level. If real, this would represent an angle-independent change in α relative
to laboratory values. Although the monopole is unusual, we showed that the
VLT and Keck samples yield extremely consistent estimates of its value. This means
that, whatever the cause of the monopole, it cannot be responsible for the observed
angular variation in α. We discussed several possible explanations for the presence
of this monopole, and concluded that evolution in the abundance of magnesium
isotopes is the most likely cause, rather than universal temporal evolution in α. The
lack of a clear explanation of the monopole in the low-redshift sample is a weakness of
the results presented here, but given the good consistency between the Keck
and VLT results, and between the angular components of the low- and high-redshift
samples, we do not think that its existence significantly affects the detection of
angular variations in α. Future observations should be able to determine the cause of
the monopole.

12. We showed that the dipole effect is not caused by a small number of outlying
data points, by iteratively clipping away Δα/α values about the model and
examining the effect this has on both the significance of the dipole and the fitted
direction. We also showed that the effect is not caused by a small number of
aberrant spectra, by exploring the influence of randomly removing spectra.

13. Because the observed angular variation in Δα/α is larger at high
redshifts, we explored simple distance-dependent models, where the dipole amplitude
scales as z^β for some β, and also where the amplitude scales with the lookback-time
distance to the absorbers, r = ct. We showed that, for the model Δα/α = B r cos Θ + m,
the statistical significance of the dipole effect increases to 4.2σ, and the dipole
points in the direction RA = (17.5 ± 1.0) hr, dec = (−62 ± 10)°, with amplitude
B = (1.1 ± 0.2) × 10⁻⁶ GLyr⁻¹.

14. We concluded that the results set out in chapter 4 present strong statistical evidence
for spatial variations in α.

15. In chapter 5, we considered the effect of some possible systematic errors on Δα/α.
We argued there that the dipole effect seen is intrinsically difficult to emulate through
systematic effects, because the systematic effect must either be well correlated with
sky position (at both the Keck and VLT telescopes), or there must be a conspiracy
of systematic effects that by chance produces angular variation in α in an extremely
consistent way between the two telescopes. One obvious consideration is the effect
of wavelength calibration at both the Keck and VLT telescopes; wavelength scale
distortions could easily lead to spurious values of Δα/α. To investigate possible
wavelength distortions empirically, we noted that there are 7 quasars that appear in
both the Keck and VLT samples. Because observations of the same absorption lines
from both telescopes should yield the same observed wavelengths, we can construct the
Δv test. In the Δv test, one fits many absorption lines at different wavelengths
in each quasar spectral pair, but allows for and estimates a velocity difference in
corresponding spectral regions between the two telescopes.

16. We utilised Δv data for the 7 spectral pairs in the VLT and Keck samples to
investigate whether common wavelength distortions exist. For all of the spectral pairs,
there is no evidence for a common wavelength distortion. Unfortunately, each
spectral pair only provides values of Δv for a limited wavelength range. We combined
the Δv data from six of the spectral pairs and modelled the distortion with a simple
linear function. Using both the LTS and SBLR methods, we were unable to find
statistically significant evidence for a common linear wavelength distortion.
Nevertheless, we modelled the impact of the measured distortion on the dipole. This
reduced the statistical significance of the dipole model for the combined Keck+VLT
sample from 3.9σ (calculated from a reference set) to 3.1σ, and thus did not eliminate
the dipole signal. Importantly, this did not appreciably alter the location of the fitted
dipole. Therefore, despite the reduced statistical significance, the application of this
Δv function does not destroy the good alignment between the Keck and VLT dipole
vectors, nor between dipole models fitted to z < 1.6 and z > 1.6 sample cuts.

17. We also considered the 7th spectral pair, 2206-1958/J220852-194359. This spectral
pair displays significant relative wavelength distortions. On account of the restricted
wavelength range of the Δv data, we modelled the observed Δv data with an
arctangent approximation and extrapolated to red wavelengths. We explored the impact
of this function on Δα/α and found that a distortion of this magnitude, if present
in all spectra, would generate an extremely strong signal for Δα/α that is observed
in neither the Keck nor VLT data sets.

18. We applied a Monte Carlo method in which the (non-significant) common linear Δv
function from 6 of the spectral pairs was applied to 6/7 of the VLT spectra, chosen at
random, and the arctangent Δv function from the 2206-1958/J220852-194359 pair to the
remaining 1/7 of the VLT spectra. We found that in almost all cases this
significantly increases the AICC, allowing us to reject the presence of a distortion of
this type in most cases. We therefore consider it unlikely that our data are
affected by a combined wavelength distortion of this type.

19. We considered the impact of the echelle intra-order distortions found by Whitmore
et al. (2010) on the combined results. Using a simple model for the distortion in the
VLT data, with a peak-to-peak amplitude of ≈ 300 m s⁻¹, we found that the impact
on the location of the dipole was relatively small. The significance of the VLT dipole
is reduced from 2.2σ to 1.6σ, but this is largely due to the randomising effect that a
model of this type has on the Δα/α values. As a result, the significance of the
VLT+Keck dipole is reduced to 3.3σ. Importantly, because the intra-order
distortions do not demonstrate any long-range component, the effect of the distortions is
to add random noise to the Δα/α values; they cannot manufacture a dipole or
monopole. In fact, we have already accounted for random effects like this by
conservatively increasing our error bars, and so we consider that the distortions found
by Whitmore et al. are already accounted for adequately in our VLT angular dipole
significance estimate of 2.2σ (and, thus, the VLT+Keck angular dipole significance
estimate of 4.1σ).

20. Ultimately, the sample size we have for the Δv test is small. A strong priority
for future work should be obtaining observations of the same objects from both
telescopes, so that wavelength-dependent systematics may be better constrained.
Nevertheless, from the considerations in chapter 5 we argue that it is unlikely that
wavelength distortions are responsible for the observed angular variation in α.

21. We also considered the effect of variation in the heavy Mg isotope fraction, Γ, as
Δα/α is sensitive to variation in Γ from the terrestrial value of Γ_t = 0.21. We
investigated the extreme case Γ = 0 by discarding the ²⁵Mg and ²⁶Mg fraction (i.e.
by fitting absorbers with only ²⁴Mg). We showed that the effect of this is to push the
Δα/α values to be more negative, inducing a greater significance for any monopole
component of a model. Importantly, this experiment has no effect of consequence
on the fitted dipole locations, reinforcing our earlier argument that any systematic
which generates an angular variation of α must be well correlated with sky position.
Despite the increased scatter introduced into the data as a result of this investigation,
the dipole model still remains significant at the 3.5σ level. We also investigated the
possibility that the quasar clouds show an enriched heavy Mg fraction relative to
terrestrial values. By considering the effect this has on the z < 1.6 monopole, we
showed that the monopole could be made to disappear if Γ = 0.32 ± 0.03, which is
significantly higher than the terrestrial value of Γ_t = 0.21. Ultimately, more work is
needed to resolve this issue, but differences in the heavy Mg isotope fraction cannot
be responsible for the observed angular variation in α.

22. Thus, we cannot find a systematic effect which is responsible for the detected spatial
variation in α. We cannot rule out the possibility that the detection of angular
variations in α presented here is the result of a conspiracy of systematic effects, or
a systematic effect in both the VLT and Keck telescopes which is well correlated
with sky position. We argued in chapter 5 that zenith-dependent systematics are
unlikely. A systematic effect which is correlated specifically with declination in the
same way in both telescopes would be very unusual, and we are not aware of any
process which could generate this. Future observations with a third telescope will
help to rule out telescope-dependent systematics.

23. In chapter 6, we reviewed the consistency of the Δμ/μ and Δα/α results. If both
sets of results are correct, they immediately imply that |R| < 3 if Δμ/μ = R(Δα/α),
which contradicts the quite general predictions of GUTs and string theory models
that |R| ~ 35. Although the Δμ/μ results including the two z < 1 NH₃ constraints
do not suggest spatial variation in μ that is consistent with the α results, if we fit only
the Δμ/μ constraints derived from the H₂ data we find that the fitted dipole points
in a similar direction to the Δα/α dipole, with the dipole vectors being separated by
only 18°. The interpretation of this is obviously hampered by a very limited sample
size.

24. We also reviewed the Δα/α results in the context of other observations, both local
and astrophysical, and concluded that the results of chapter 4 are not in conflict
with any other existing constraints on the variation of fundamental constants.

25. We discussed various claims for anisotropy in the universe, and noted that there is an
intriguing, but far from conclusive, loose alignment between different measurements
of possible anisotropy in the universe.

26. In chapter 7, we investigated whether VPFIT produces correct parameter estimates
and statistical uncertainties by applying Markov Chain Monte Carlo (MCMC) methods.
MCMC methods greatly outperform traditional Monte Carlo methods (random sampling
of the likelihood function) in high dimensions: their performance degrades only
polynomially with increasing dimensionality, whereas that of traditional Monte Carlo
methods degrades exponentially. We modified VPFIT to allow for MCMC exploration of
the likelihood function of the Voigt profile fit using a modification of the well-known
Metropolis sampler, the Multiple Try Metropolis method. We applied the resultant
algorithm to several simple Voigt profile fits. Despite the advantage of MCMC methods,
reasonable exploration of the parameter space nevertheless takes hours to a few days. We
verified what we set out to check: that the statistical estimates of Δα/α produced
by VPFIT are reasonable. We also demonstrated that even where the joint likelihood
function is non-Gaussian for individual line parameters, the likelihood for Δα/α is
indeed Gaussian. This is expected, but reassuring, and justifies the use of only a
maximum likelihood estimate and standard error when describing the estimate of
Δα/α for an absorber; higher order moments (skewness, etc.) can safely be neglected.
The results of this chapter lend confidence to the results of Murphy et al. (2004) and
of this work in investigating potential changes in μ and α.

27. We attempted to apply MCMC methods to Δμ/μ, but found that even with ample
computing resources the problem remains intractable with a Metropolis-type
sampler. We noted that this problem may become tractable in the future
as a result of better computing facilities, but also noted that recent advances (e.g.
nested sampling) may also assist in directly investigating the likelihood function for
complicated molecular hydrogen fits.
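
As an illustration of the dipole+monopole model of item 9, Δα/α = A cos Θ + m, the
following minimal Python sketch evaluates the model for a given sightline, taking Θ to be
the angle between the sightline and the dipole direction. The function names are
hypothetical, and the monopole value m = 0 is a placeholder assumed purely for
illustration (it is not a fitted value from this work); the amplitude and direction are
the best-fit values quoted in item 9.

```python
import math

def cos_angle(ra1_hr, dec1_deg, ra2_hr, dec2_deg):
    """Cosine of the angle between two sky positions (RA in hours, dec in degrees)."""
    ra1, ra2 = math.radians(15.0 * ra1_hr), math.radians(15.0 * ra2_hr)
    d1, d2 = math.radians(dec1_deg), math.radians(dec2_deg)
    return (math.sin(d1) * math.sin(d2)
            + math.cos(d1) * math.cos(d2) * math.cos(ra1 - ra2))

def dipole_monopole(ra_hr, dec_deg, amp, m, pole_ra_hr, pole_dec_deg):
    """Model value amp * cos(Theta) + m for a sightline at (ra_hr, dec_deg)."""
    return amp * cos_angle(ra_hr, dec_deg, pole_ra_hr, pole_dec_deg) + m

POLE_RA_HR, POLE_DEC_DEG = 17.3, -61.0   # best-fit dipole direction (item 9)
AMP = 0.97e-5                            # best-fit dipole amplitude (item 9)
M = 0.0                                  # placeholder monopole, assumed for illustration

# Sightlines toward the pole and toward the anti-pole (RA + 12 hr, dec -> -dec).
toward = dipole_monopole(17.3, -61.0, AMP, M, POLE_RA_HR, POLE_DEC_DEG)
away = dipole_monopole(5.3, 61.0, AMP, M, POLE_RA_HR, POLE_DEC_DEG)
print(f"toward pole: {toward:.2e}, toward anti-pole: {away:.2e}")
```

Under this model the predicted Δα/α is largest along the dipole direction and changes
sign in the opposite hemisphere, which is why sightlines spread widely over the sky
constrain the model.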
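
The correspondence in item 10 between a chance probability of 0.1 percent and a
significance of ≈ 3.3σ can be checked with a short sketch. It assumes a two-sided
Gaussian convention for converting probabilities to σ; that convention is our assumption
here, chosen because it reproduces the quoted figure, and the function name is
hypothetical.

```python
from statistics import NormalDist

def p_to_sigma(p_two_sided):
    """Gaussian-equivalent significance of a two-sided chance probability."""
    return NormalDist().inv_cdf(1.0 - p_two_sided / 2.0)

sigma_joint = p_to_sigma(0.001)   # joint alignment probability of 0.1 percent
print(f"{sigma_joint:.2f} sigma")
```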
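
The linear Δv model of item 16 can be sketched as a straight-line fit of Δv against
wavelength. The analysis itself used the robust LTS and SBLR methods; the sketch below
substitutes ordinary weighted least squares as a simpler stand-in, and the data points
are synthetic values invented for illustration only.

```python
import math

def weighted_linear_fit(x, y, err):
    """Weighted least-squares fit of y = intercept + slope * x."""
    w = [1.0 / e**2 for e in err]
    s = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s * sxx - sx * sx
    slope = (s * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    slope_err = math.sqrt(s / det)        # 1-sigma uncertainty on the slope
    return intercept, slope, slope_err

wavelength = [4000.0, 5000.0, 6000.0, 7000.0]   # Angstrom (synthetic)
delta_v = [10.0, 20.0, 30.0, 40.0]               # m/s (synthetic)
error = [1.0, 1.0, 1.0, 1.0]                     # m/s (synthetic)

intercept, slope, slope_err = weighted_linear_fit(wavelength, delta_v, error)
print(f"slope = {slope:.4f} +/- {slope_err:.4f} m/s per Angstrom")
```

The ratio of the fitted slope to its uncertainty gives a simple measure of whether a
linear distortion is statistically significant. Unlike LTS, this estimator is not robust
to outlying Δv values.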
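
The model comparison in item 18 relies on the AICC. A minimal sketch of the statistic
follows, assuming the standard small-sample corrected Akaike information criterion for
Gaussian errors, AICC = χ² + 2k + 2k(k + 1)/(n − k − 1), with k free parameters and n
data points. The χ² values and parameter counts below are hypothetical numbers chosen
only to show the comparison.

```python
def aicc(chisq, n_params, n_points):
    """Corrected Akaike information criterion for a chi-squared fit."""
    if n_points - n_params - 1 <= 0:
        raise ValueError("AICC needs n_points > n_params + 1")
    correction = 2.0 * n_params * (n_params + 1) / (n_points - n_params - 1)
    return chisq + 2.0 * n_params + correction

# Hypothetical comparison: the same model before and after applying a distortion.
aicc_original = aicc(100.0, 4, 154)
aicc_distorted = aicc(125.0, 4, 154)
delta = aicc_distorted - aicc_original
print(f"Delta AICC = {delta:.1f}")   # positive: the distorted data are disfavoured
```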
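
To make the sampling idea in item 26 concrete, the sketch below implements the basic
random-walk Metropolis sampler for a one-parameter toy likelihood. It is not the Multiple
Try Metropolis variant used with VPFIT, and the Gaussian toy likelihood, step size and
chain length are assumptions chosen only for illustration.

```python
import math
import random
import statistics

def metropolis(log_like, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a one-parameter log-likelihood."""
    rng = random.Random(seed)
    x, logp = start, log_like(start)
    chain = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)   # symmetric Gaussian proposal
        logp_new = log_like(proposal)
        # Accept with probability min(1, L_new / L_old).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
        chain.append(x)                        # a rejected step repeats x
    return chain

def toy_log_like(x, mean=1.0, sigma=0.5):
    """Gaussian toy log-likelihood (up to an additive constant)."""
    return -0.5 * ((x - mean) / sigma) ** 2

chain = metropolis(toy_log_like, start=0.0, step=1.0, n_samples=20000)
samples = chain[1000:]                         # discard burn-in
est_mean = statistics.mean(samples)
est_sigma = statistics.stdev(samples)
print(f"mean = {est_mean:.2f}, sigma = {est_sigma:.2f}")
```

Comparing the sample mean and standard deviation of the chain with a maximum likelihood
estimate and its standard error is the kind of consistency check described in item 26,
here reduced to a single parameter.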



8.1 Future work



We have described in various places throughout this work how future research may be able
to shed more light on whether the fundamental constants truly vary. The results of chapter
4 are exciting, in that they yield evidence for variation in α that is independent of and
consistent with that obtained from Keck/HIRES. The most obvious experimental path
that is complementary to the Keck/VLT work is the use of a third telescope/spectrograph
combination; Subaru/HDS (High Dispersion Spectrograph) is currently the best choice.
The coming decade should see construction of one or more next-generation extremely large
optical telescopes, with primary mirror diameters of at least ~ 20 metres (and possibly as
high as ~ 40 metres), depending on the ultimate design. The spectrographs for these
telescopes will be built with extreme precision and stability in mind.

ESPRESSO (Echelle Spectrograph for Rocky Exoplanet- and Stable Spectroscopic
Observation) has recently been approved for construction and installation at the VLT, with
operation scheduled to commence around 2014. Although the instrument will be able to
operate in 1-UT mode (using the light from a single VLT telescope), it will also be able
to operate in 4-UT mode, where the light from all four VLT telescopes is collected at
an incoherent focus, giving a collecting area equivalent to a 16 m telescope. The
spectrograph is designed to achieve R = 140,000 and 10 cm s⁻¹ precision for radial-velocity
planet searches, which would in principle allow the detection of Earth-like planets. Molaro
(2007) discussed the science case for ESPRESSO in the context of variation of fundamental
constants, and suggested that 30 m s⁻¹ precision on narrow lines may be achievable with
a few hours of integration. A shift of this magnitude corresponds to Δα/α ~ 1.4 × 10⁻⁶ for
the Fe II λ2382 transition. This is in principle enough to determine accurately whether the
results of chapter 4 are correct, unless systematic or random effects are significant.
The results of chapter 4 suggest that random effects of order ~ 10⁻⁵ exist; these may
reduce with higher quality observations and instruments, but also may not.
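
The correspondence between a 30 m s⁻¹ velocity shift and Δα/α ~ 1.4 × 10⁻⁶ for Fe II
λ2382 follows from the many-multiplet relation ω = ω₀ + q[(α_z/α₀)² − 1] ≈ ω₀ + 2q Δα/α,
which gives |Δv|/c ≈ 2q|Δα/α|/ω₀. The sketch below evaluates this. The wavenumber
ω₀ ≈ 41968 cm⁻¹ is simply 1/λ for λ ≈ 2382.8 Å, and the sensitivity coefficient
q ≈ 1500 cm⁻¹ is an assumed round value used for illustration, not a value taken from
the tables in this work.

```python
C_M_PER_S = 299_792_458.0   # speed of light

def dalpha_from_velocity(dv_m_per_s, omega0_cm, q_cm):
    """|Delta alpha / alpha| equivalent to a line shift dv, valid for small shifts."""
    return (dv_m_per_s / C_M_PER_S) * omega0_cm / (2.0 * q_cm)

OMEGA0_FE2382 = 41968.0   # cm^-1, approximate laboratory wavenumber of Fe II 2382
Q_FE2382 = 1500.0         # cm^-1, assumed round value of the sensitivity coefficient

shift = dalpha_from_velocity(30.0, OMEGA0_FE2382, Q_FE2382)
print(f"Delta alpha / alpha ~ {shift:.1e}")
```

Transitions with larger q relative to ω₀ give a larger velocity shift for the same
Δα/α, which is why comparing transitions with different q values provides the
sensitivity of the many-multiplet method.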

Liske et al. (2009) discussed CODEX (COsmic Dynamics Experiment), the planned
high-resolution optical spectrograph for the E-ELT (European Extremely Large Telescope). The
primary purpose of the spectrograph is to directly observe the expansion of the universe
by measuring changes in the redshifts of absorption features. The intended target is the
Lyman-α forest, as it provides many lines over a large redshift range. To achieve its science
goals, CODEX will need to deliver a radial velocity accuracy of 2 cm s⁻¹ over a time-scale
of ~ 20 years. If CODEX can deliver such precision then this will potentially improve the
current constraints on Δα/α and Δμ/μ by several orders of magnitude. However, it must
be said that it is not clear to what extent this precision will be limited by uncontrollable
systematics or random effects (for example, the kinematics of the quasar absorbers).

Wavelength calibration uncertainties remain a significant problem for optical spectroscopic
measurements, as the results of Griest et al. (2010) and Whitmore et al. (2010)
demonstrate. Even if issues regarding the quasar light path can be removed, the ThAr standard
used is itself problematic. The ThAr lines are unevenly distributed across the visual
spectrum, and the intensity of the lines differs greatly. The calibration in some echelle orders
is inevitably sub-optimal due to low numbers of usable ThAr lines. Laser combs have
recently held out the promise of vastly better wavelength calibration; laser combs can generate
evenly spaced transitions across the optical spectrum for which the absolute calibration is
known a priori. Steinmetz et al. (2008) discussed the first implementation of laser comb
calibration at an astronomical observatory, achieving 9 m s⁻¹ radial velocity precision at
1.5 μm, which they described as "beyond state-of-the-art". Murphy et al. (2007b) discussed
simulations of optical laser combs which show that integration over a 4000 Å range could
produce calibration uncertainties as low as 1 cm s⁻¹, which has the potential to "remove
wavelength calibration uncertainties from all practical spectroscopic experiments". For
these precisions to be realised, combs will need to demonstrate increased pulse repetition
rates and more uniform intensity over the optical range compared to what is available at
present.

Near-term verification (or otherwise) of evolution of the fundamental constants may occur
more rapidly with radio measurements, as noted earlier. New facilities are scheduled to
commence operation shortly which will offer extreme precision. For instance, the Square
Kilometre Array (SKA) will be a radio telescope of unparalleled sensitivity due to its
enormous collecting area. Although observations are not scheduled to start until 2017,
the Australian and South African pathfinder telescopes (ASKAP and MeerKAT
respectively) will become operational before this. Curran et al. (2004) considered then-current
results on the variation of fundamental constants and existing biases in surveys for radio
absorbers in the context of the SKA. The Atacama Large Millimetre Array (ALMA) will
also soon be operational. Combes (2009) reviewed existing constraints on fundamental
constants from radio lines, with some consideration given to estimates of the increased number
of sources detectable with ALMA.

Ultimately, continuing improvements in atomic clocks and the new instrumentation that
will be available for astrophysical measurements over the next decade mean that the
future for investigations into whether the fundamental constants vary is bright.

Q0405-443 Voigt profile fits



In this appendix, we provide the fits for the z = 2.595 H₂ absorber toward Q0405-443.
Figure A.1: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 1). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.2: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 2). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.3: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 3). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.4: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 4). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.5: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 5). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.6: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 6). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.7: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 7). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.8: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 8). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.9: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 9). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.10: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 10). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H₂ components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.11: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 11). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H₂ components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.12: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 12). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H₂ components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.13: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 13). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H₂ components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure A.14: H₂ fit for the z = 2.595 absorber toward Q0405-443 (part 14). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H₂ components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Q0347-383 Voigt profile fits



In this appendix, we provide the fits for the z = 3.025 H₂ absorber toward Q0347-383.
Figure B.1: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 1). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure B.2: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 2). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure B.3: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 3). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure B.4: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 4). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure B.5: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 5). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure B.6: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 6). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure B.7: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 7). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
Figure B.8: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 8). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.



Figure B.9: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 9). The vertical
axis shows normalised flux. The model fitted to the spectra is shown in green. Red
tick marks indicate the position of H₂ components, whilst blue tick marks indicate the
position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.



Figure B.10: H₂ fit for the z = 3.025 absorber toward Q0347-383 (part 10). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H₂ components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H₂ transitions are plotted below the data.
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Figure B.ll: H2 fit for the z = 3.025 absorber toward Q0347-383 (part 11). The 
vertical axis shows normalised flux. The model fitted to the spectra is shown in green. 
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate 
the position of blending transitions (presumed to be Lyman-a). Normalised residuals (i.e. 
[data - model] /error) are plotted above the spectrum between the orange bands, which 
represent itlo". Labels for the H2 transitions are plotted below the data. 
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Figure B.12: H2 fit for the z = 3.025 absorber toward Q0347-383 (part 12). The 
vertical axis shows normalised flux. The model fitted to the spectra is shown in green. 
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate 
the position of blending transitions (presumed to be Lyman-a). Normalised residuals (i.e. 
[data - model] /error) are plotted above the spectrum between the orange bands, which 
represent itlo". Labels for the H2 transitions are plotted below the data. 
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Figure B.13: H2 fit for the z = 3.025 absorber toward Q0347-383 (part 13). The 
vertical axis shows normalised flux. The model fitted to the spectra is shown in green. 
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate 
the position of blending transitions (presumed to be Lyman-a). Normalised residuals (i.e. 
[data - model] /error) are plotted above the spectrum between the orange bands, which 
represent itlo". Labels for the H2 transitions are plotted below the data. 
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Figure B.14: H2 fit for the z = 3.025 absorber toward Q0347-383 (part 14). Tlie 
vertical axis shows normalised flux. The model fitted to the spectra is shown in green. 
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate 
the position of blending transitions (presumed to be Lyman-a). Normalised residuals (i.e. 
[data - model] /error) are plotted above the spectrum between the orange bands, which 
represent itlo". Labels for the H2 transitions are plotted below the data. 

Appendix C

Q0528-250:A Voigt profile fits

In this appendix, we provide the fits for the z = 2.811 H2 absorber toward Q0528-250.
The fit relates to our first analysis of this object, published in King et al. (2008). The
analysis for Δμ/μ is set out in section 3.4.3.

Figure C.1: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 1). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.2: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 2). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.3: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 3). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.4: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 4). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.5: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 5). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.6: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 6). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.7: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 7). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.8: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 8). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.9: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 9). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.10: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 10). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.11: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 11). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.12: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 12). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.13: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 13). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure C.14: H2 fit for the z = 2.811 absorber toward Q0528-250:A (part 14). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Appendix D

Q0528-250:B2 Voigt profile fits

We present here our reanalysis of the spectrum of the z = 2.811 absorber toward Q0528-250,
using exposures obtained on VLT/UVES under program ID 82.A-0087. We describe the
results of this analysis in section 3.6.

Figure D.1: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 1). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.2: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 2). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.3: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 3). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.4: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 4). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.5: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 5). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.6: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 6). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.7: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 7). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.8: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 8). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.9: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 9). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.10: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 10). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.11: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 11). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.12: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 12). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.13: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 13). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Figure D.14: H2 fit for the z = 2.811 absorber toward Q0528-250:B2 (part 14). The
vertical axis shows normalised flux. The model fitted to the spectra is shown in green.
Red tick marks indicate the position of H2 components, whilst blue tick marks indicate
the position of blending transitions (presumed to be Lyman-α). Normalised residuals (i.e.
[data - model]/error) are plotted above the spectrum between the orange bands, which
represent ±1σ. Labels for the H2 transitions are plotted below the data.

Appendix E

Many-multiplet Voigt profile fits

In this appendix, we provide the fits for the many-multiplet systems considered in chapter
4. Each absorber is plotted on a velocity scale, such that corresponding components align
vertically. Velocities are given as differences from an arbitrary redshift, which is usually
chosen to be close to the maximum optical depth of the absorber. The positions of fitted
components are indicated by blue tick marks. Plotted above each fit are the residuals of
the fit, that is [fit - data]/error, where the error is the 1σ uncertainty associated with each
flux pixel. The two red lines indicate ±1σ, within which the residuals are expected to
occur about 68% of the time if the errors are Gaussian, the error array is correct and the
fitted model is a good representation of the data.
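
The velocity scale and the normalised residuals described above can be illustrated with a short sketch. It is not the code used to produce the plots: the function names are hypothetical, and the first-order Doppler relation used for the velocity offset is an assumption, since the text does not state which conversion was applied.

```python
# Illustrative sketch only; names and the velocity formula are assumptions.
C_KM_S = 299792.458  # speed of light in km/s


def velocity_offset(obs_wavelength, rest_wavelength, z_ref):
    """Velocity (km/s) of an observed wavelength relative to the position
    of a transition at the reference redshift z_ref.

    Uses the first-order relation v = c * (lambda - lambda_c) / lambda_c,
    where lambda_c = rest_wavelength * (1 + z_ref).
    """
    centre = rest_wavelength * (1.0 + z_ref)
    return C_KM_S * (obs_wavelength - centre) / centre


def normalised_residuals(fit, data, error):
    """Return [fit - data] / error for each flux pixel."""
    return [(f - d) / e for f, d, e in zip(fit, data, error)]


def fraction_within_one_sigma(residuals):
    """Fraction of normalised residuals lying within +/-1 sigma."""
    inside = sum(1 for r in residuals if abs(r) <= 1.0)
    return inside / len(residuals)
```

For a well-fitted region with correct, Gaussian errors, `fraction_within_one_sigma` should come out near 0.68; a markedly lower value suggests either an underestimated error array or a model that does not represent the data.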

Each plot contains a maximum of 16 regions. In the event that there are more fitting 
regions than this, the fit is split into several parts. Each part may contain common 
transitions so as to provide a common reference, and to illustrate the velocity structure 
more clearly. 
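
The splitting rule in the preceding paragraph can be sketched as follows. This is only one possible scheme, written under stated assumptions: the function name, the `common` argument, and the choice to repeat the same reference regions in every part are all hypothetical and are not taken from the plotting code used for these figures.

```python
# Illustrative sketch only; the grouping strategy is an assumption.
def split_into_parts(regions, common=(), max_regions=16):
    """Split fitting regions into plot parts of at most max_regions each.

    Regions listed in `common` are repeated at the top of every part so
    that each part shares a common velocity reference.
    """
    regions = list(regions)
    if len(regions) <= max_regions:
        return [regions]  # everything fits on a single plot

    shared = [r for r in regions if r in common]
    others = [r for r in regions if r not in common]
    per_part = max_regions - len(shared)
    if per_part <= 0:
        raise ValueError("too many common regions to fit in one plot")

    return [shared + others[i:i + per_part]
            for i in range(0, len(others), per_part)]
```

With 20 regions and one common reference region, this yields two parts, each beginning with the reference region and neither exceeding 16 regions.
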

Figure E.1: Many-multiplet fit for the z = 0.452 absorber toward J000344-232355.
Figure E.2: Many-multiplet fit for the z = 0.949 absorber toward J000344-232355. 
Figure E.3: Many-multiplet fit for the z = 1.586 absorber toward J000344-232355.
Figure E.4: Many-multiplet fit for the z = 1.542 absorber toward J000448-415728.
Figure E.5: Many-multiplet fit for the z = 1.989 absorber toward J000448-415728.
Figure E.6: Many-multiplet fit for the z = 2.168 absorber toward J000448-415728.
Figure E.7: Many-multiplet fit for the z = 1.203 absorber toward J001210-012207.
Figure E.8: Many-multiplet fit for the z = 0.635 absorber toward J001602-001225.
Figure E.9: Many-multiplet fit for the z = 0.636 absorber toward J001602-001225.
Figure E.10: Many-multiplet fit for the z = 0.857 absorber toward J001602-001225.
Figure E.11: Many-multiplet fit for the z = 1.147 absorber toward J001602-001225.
Figure E.12: Many-multiplet fit for the z = 2.029 absorber toward J001602-001225.
Figure E.13: Many-multiplet fit for the z = 2.110 absorber toward J004131-493611.
Figure E.14: Many-multiplet fit for the z = 2.249 absorber toward J004131-493611 (part 1).
Figure E.15: Many-multiplet fit for the z = 2.249 absorber toward J004131-493611 (part 2).
Figure E.16: Many-multiplet fit for the z = 1.268 absorber toward J005758-264314.
Figure E.17: Many-multiplet fit for the z = 1.534 absorber toward J005758-264314.
Figure E.18: Many-multiplet fit for the z = 1.797 absorber toward J010311+131617.
Figure E.19: Many-multiplet fit for the z = 2.309 absorber toward J010311+131617.
Figure E.20: Many-multiplet fit for the z = 1.933 absorber toward J010821+062327.
Figure E.21: Many-multiplet fit for the z = 1.183 absorber toward J011143-350300.
Figure E.22: Many-multiplet fit for the z = 1.348 absorber toward J011143-350300.
Figure E.23: Many-multiplet fit for the z = 0.822 absorber toward J012417-374423.
Figure E.24: Many-multiplet fit for the z = 0.859 absorber toward J012417-374423.
Figure E.25: Many-multiplet fit for the z = 1.243 absorber toward J012417-374423.
Figure E.26: Many-multiplet fit for the z = 1.910 absorber toward J012417-374423.
Figure E.27: Many-multiplet fit for the z = 1.857 absorber toward J013105-213446.
Figure E.28: Many-multiplet fit for the z = 0.340 absorber toward J014333-391700.
Figure E.29: Many-multiplet fit for the z = 1.710 absorber toward J014333-391700.
Figure E.30: Many-multiplet fit for the z = 0.769 absorber toward J015733-004824.
Figure E.31: Many-multiplet fit for the z = 1.185 absorber toward J024008-230915.
Figure E.32: Many-multiplet fit for the z = 1.636 absorber toward J024008-230915.
Figure E.33: Many-multiplet fit for the z = 1.637 absorber toward J024008-230915.
Figure E.35: Many-multiplet fit for the z = 0.763 absorber toward J033106-382404.
Figure E.36: Many-multiplet fit for the z = 0.971 absorber toward J033106-382404.
Figure E.37: Many-multiplet fit for the z = 1.438 absorber toward J033106-382404.
Figure E.38: Many-multiplet fit for the z = 0.993 absorber toward J033108-252443.
Figure E.39: Many-multiplet fit for the z = 2.455 absorber toward J033108-252443. 
Figure E.40: Many-multiplet fit for the z = 2.411 absorber toward J033244-445557.
Figure E.41: Many-multiplet fit for the z = 2.656 absorber toward J033244-445557.
Figure E.42: Many-multiplet fit for the z = 2.413 absorber toward J040718-441013.
Figure E.43: Many-multiplet fit for the z = 2.550 absorber toward J040718-441013.
Figure E.44: Many-multiplet fit for the z = 2.595 absorber toward J040718-441013.
Figure E.45: Many-multiplet fit for the z = 2.621 absorber toward J040718-441013.
Figure E.46: Many-multiplet fit for the z = 1.408 absorber toward J042707-130253.
Figure E.47: Many-multiplet fit for the z = 1.563 absorber toward J042707-130253.
Figure E.48: Many-multiplet fit for the z = 2.035 absorber toward J042707-130253.
Figure E.49: Many-multiplet fit for the z = 1.355 absorber toward J043037-485523 (part 1).
Figure E.50: Many-multiplet fit for the z = 1.355 absorber toward J043037-
(part 2).
Figure E.51: Many-multiplet fit for the z = 1.433 absorber toward J044017-433308.
Figure E.52: Many-multiplet fit for the z = 2.048 absorber toward J044017-433308.
Figure E.53: Many-multiplet fit for the z = 0.222 absorber toward J051707-441055.
Figure E.54: Many-multiplet fit for the z = 0.429 absorber toward J051707-441055.
Figure E.55: Many-multiplet fit for the z = 2.141 absorber toward J053007-250329.
Figure E.56: Many-multiplet fit for the z = 1.226 absorber toward J055246-363727.
Figure E.57: Many-multiplet fit for the z = 1.748 absorber toward J055246-363727.
Figure E.58: Many-multiplet fit for the z = 1.957 absorber toward J055246-363727.
Figure E.59: Many-multiplet fit for the z = 2.659 absorber toward J064326-504112.
Figure E.60: Many-multiplet fit for the z = 1.332 absorber toward J091613+070224.
Figure E.61: Many-multiplet fit for the z = 1.060 absorber toward J094253-110426.
Figure E.62: Many-multiplet fit for the z = 1.789 absorber toward J094253-110426.
Figure E.63: Many-multiplet fit for the z = 1.443 absorber toward J103909-231326.
Figure E.64: Many-multiplet fit for the z = 2.778 absorber toward J103909-231326.
Figure E.65: Many-multiplet fit for the z = 0.877 absorber toward J103921-271916.
Figure E.66: Many-multiplet fit for the z = 1.009 absorber toward J103921-271916.
Figure E.67: Many-multiplet fit for the z = 1.913 absorber toward J103921-271916.
Figure E.68: Many-multiplet fit for the z = 1.972 absorber toward J103921-271916.
Figure E.69: Many-multiplet fit for the z = 1.386 absorber toward J104032-272749
Figure E.70: Many-multiplet fit for the z = 1.386 absorber toward J104032-272749 (part 2).
Figure E.71: Many-multiplet fit for the z = 1.776 absorber toward J104032-272749.
Figure E.72: Many-multiplet fit for the z = 1.187 absorber toward J110325-264515.
Figure E.73: Many-multiplet fit for the z = 1.203 absorber toward J110325-264515.
Figure E.74: Many-multiplet fit for the z = 1.552 absorber toward J110325-264515.
Figure E.75: Many-multiplet fit for the z = 1.839 absorber toward J110325-264515.
Figure E.76: Many-multiplet fit for the z = 3.608 absorber toward J111113-080401.
Figure E.78: Many-multiplet fit for the z = 0.806 absorber toward J112442-170517.
Figure E.79: Many-multiplet fit for the z = 1.234 absorber toward J112442-170517.
Figure E.80: Many-multiplet fit for the z = 1.774 absorber toward J115411+063426.
Figure E.81: Many-multiplet fit for the z = 1.820 absorber toward J115411+063426.
Figure E.83: Many-multiplet fit for the z = 0.791 absorber toward J115944+011206.
Figure E.84: Many-multiplet fit for the z = 1.330 absorber toward J115944+011206.
Figure E.85: Many-multiplet fit for the z = 1.944 absorber toward J115944+011206.
Figure E.86: Many-multiplet fit for the z = 1.322 absorber toward J120342+102831.



E. Many-multiplet Voigt profile fits 




Figure E.87: Many-multiplet fit for the z = 1.342 absorber toward J120342+102831.
Figure E.88: Many-multiplet fit for the z = 1.579 absorber toward J120342+102831.
Figure E.89: Many-multiplet fit for the z = 1.050 absorber toward J121140+103002.
Figure E.91: Many-multiplet fit for the z = 0.831 absorber toward J123200-022404.
Figure E.92: Many-multiplet fit for the z = 1.020 absorber toward J123437+075843.
Figure E.93: Many-multiplet fit for the z = 1.719 absorber toward J123437+075843.
Figure E.94: Many-multiplet fit for the z = 0.745 absorber toward J133335+164903.
Figure E.95: Many-multiplet fit for the z = 1.325 absorber toward J133335+164903.
Figure E.96: Many-multiplet fit for the z = 1.777 absorber toward J133335+164903.
Figure E.97: Many-multiplet fit for the z = 1.786 absorber toward J133335+164903.
Figure E.98: Many-multiplet fit for the z = 1.915 absorber toward J133427-103541.
Figure E.99: Many-multiplet fit for the z = 2.148 absorber toward J133427-103541.
Figure E.100: Many-multiplet fit for the z = 1.439 absorber toward J135038-251216.
Figure E.101: Many-multiplet fit for the z = 1.753 absorber toward J135038-251216.
Figure E.102: Many-multiplet fit for the z = 1.419 absorber toward J141217+091624.
Figure E.103: Many-multiplet fit for the z = 2.109 absorber toward J141217+091624.
Figure E.104: Many-multiplet fit for the z = 2.456 absorber toward J141217+091624.
Figure E.105: Many-multiplet fit for the z = 2.668 absorber toward J141217+091624.
Figure E.106: Many-multiplet fit for the z = 0.488 absorber toward J143040+014939.
Figure E.107: Many-multiplet fit for the z = 1.203 absorber toward J143040+014939.
Figure E.108: Many-multiplet fit for the z = 1.241 absorber toward J143040+014939.
Figure E.109: Many-multiplet fit for the z = 0.510 absorber toward J144653+011356.




Figure E.110: Many-multiplet fit for the z = 0.660 absorber toward J144653+011356.
Figure E.112: Many-multiplet fit for the z = 1.129 absorber toward J144653+011356.
Figure E.113: Many-multiplet fit for the z = 1.159 absorber toward J144653+011356.
Figure E.114: Many-multiplet fit for the z = 1.585 absorber toward J145102-232930.
Figure E.115: Many-multiplet fit for the z = 2.033 absorber toward J200324-325144.
Figure E.116: Many-multiplet fit for the z = 3.188 absorber toward J200324-325144.
Figure E.118: Many-multiplet fit for the z = 1.738 absorber toward J212912-153841.
Figure E.119: Many-multiplet fit for the z = 2.022 absorber toward J212912-153841.
Figure E.120: Many-multiplet fit for the z = 2.638 absorber toward J212912-153841.
Figure E.121: Many-multiplet fit for the z = 2.768 absorber toward J212912-153841.
Figure E.122: Many-multiplet fit for the z = 1.615 absorber toward J213314-464030.
Figure E.123: Many-multiplet fit for the z = 2.133 absorber toward J214159-441325.
Figure E.124: Many-multiplet fit for the z = 2.383 absorber toward J214159-441325.
Figure E.125: Many-multiplet fit for the z = 2.852 absorber toward J214159-441325.
Figure E.126: Many-multiplet fit for the z = 0.987 absorber toward J214225-442018.
Figure E.127: Many-multiplet fit for the z = 1.053 absorber toward J214225-442018.
Figure E.128: Many-multiplet fit for the z = 1.154 absorber toward J214225-442018.
Figure E.129: Many-multiplet fit for the z = 1.757 absorber toward J214225-442018.
Figure E.130: Many-multiplet fit for the z = 2.113 absorber toward J214225-442018 (part 1).
Figure E.131: Many-multiplet fit for the z = 2.113 absorber toward J214225-442018 (part 2).
Figure E.132: Many-multiplet fit for the z = 2.253 absorber toward J214225-442018.
Figure E.133: Many-multiplet fit for the z = 2.380 absorber toward J214225-442018.
Figure E.134: Many-multiplet fit for the z = 1.627 absorber toward J220734-403655.



Figure E.135: Many-multiplet fit for the z = 0.9478 absorber toward J220852-194359.
Figure E.136: Many-multiplet fit for the z = 0.9484 absorber toward J220852-194359.
Figure E.138: Many-multiplet fit for the z = 1.018 absorber toward J220852-194359.
Figure E.139: Many-multiplet fit for the z = 1.297 absorber toward J220852-194359.
Figure E.140: Many-multiplet fit for the z = 1.920 absorber toward J220852-194359.
Figure E.141: Many-multiplet fit for the z = 2.076 absorber toward J220852-194359.
Figure E.142: Many-multiplet fit for the z = 0.941 absorber toward J222006-280323.
Figure E.143: Many-multiplet fit for the z = 0.941 absorber toward J222006-280323.
Figure E.144: Many-multiplet fit for the z = 0.942 absorber toward J222006-280323.





Figure E.145: Many-multiplet fit for the z = 1.556 absorber toward J222006-280323.
Figure E.146: Many-multiplet fit for the z = 1.628 absorber toward J222006-280323.
Figure E.147: Many-multiplet fit for the z = 1.413 absorber toward J222756-224302.
Figure E.148: Many-multiplet fit for the z = 1.433 absorber toward J222756-224302.
Figure E.149: Many-multiplet fit for the z = 1.452 absorber toward J222756-224302.
Figure E.150: Many-multiplet fit for the z = 1.640 absorber toward J222756-224302.
Figure E.151: Many-multiplet fit for the z = 2.152 absorber toward J233446-090812 (part 1).
Figure E.152: Many-multiplet fit for the z = 2.152 absorber toward J233446-090812 (part 2).
Figure E.153: Many-multiplet fit for the z = 2.202 absorber toward J233446-090812.
Figure E.154: Many-multiplet fit for the z = 2.288 absorber toward J233446-090812.
Figure E.155: Many-multiplet fit for the z = 2.173 absorber toward J234625+124743.
Figure E.156: Many-multiplet fit for the z = 2.572 absorber toward J234625+124743.
Figure E.157: Many-multiplet fit for the z = 1.108 absorber toward J234628+124858.
Figure E.158: Many-multiplet fit for the z = 1.589 absorber toward J234628+124858.
Figure E.159: Many-multiplet fit for the z = 2.171 absorber toward J234628+124858.
Figure E.160: Many-multiplet fit for the z = 1.796 absorber toward J235034+432559.